Tools the agent doesn't write: MCP

Article 10 · Series: A Local Coding Agent with apfel

The tool problem has two halves. One is writing tools: read, write, execute, each implemented as code inside the agent. The other is letting the model operate them: it has to pick the right tool and hit its arguments. So far we have carried both halves ourselves. The Model Context Protocol (MCP) takes the first off our hands. Tools move into separate servers, and the agent only attaches them. This article shows concretely what that looks like with apfel, and where MCP reaches its limit. The state is frozen as tag v0.10.

A correction up front

The plan for this series had us build an MCP client into the agent. On checking, that turned out to be wrong, and the correction is instructive enough not to hide. apfel already has MCP built in. The option --mcp <path|url> attaches a local or remote MCP server to apfel, in prompt mode and in serve mode. apfel is the MCP client; the agent does not have to build one. That changes the build target: not a client, but a small server of our own.

What MCP is

An MCP server provides tools. It speaks a simple protocol: to tools/list it answers with the tools it offers, including name, description and the schema of the arguments; to tools/call it runs a tool and returns the result. The client, here apfel, discovers the tools and makes them available to the model. The gain is a separation: whoever writes a tool writes a server, once, and any MCP-capable client can use it. The tool no longer lives in the agent.
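On the wire, MCP messages are JSON-RPC 2.0. As a rough sketch of what the two requests named above look like once the handshake is done, the following builds them with Foundation alone. The type names ToolsListRequest and ToolsCallRequest and the helper encodeLine are our own illustration, not part of the MCP Swift SDK, and the example arguments are made up:

```swift
import Foundation

// Sketch of the two JSON-RPC 2.0 requests a client sends to an MCP server.
// These types are illustrative only; the MCP Swift SDK has its own.
struct ToolsListRequest: Encodable {
    let jsonrpc = "2.0"
    let id: Int
    let method = "tools/list"
}

struct ToolsCallRequest: Encodable {
    struct Params: Encodable {
        let name: String               // which tool to run
        let arguments: [String: Int]   // must match the tool's inputSchema
    }
    let jsonrpc = "2.0"
    let id: Int
    let method = "tools/call"
    let params: Params
}

// Encode a message as a single line of JSON.
func encodeLine<T: Encodable>(_ message: T) throws -> String {
    let encoder = JSONEncoder()
    // Stable key order, and "tools/list" instead of "tools\/list".
    encoder.outputFormatting = [.sortedKeys, .withoutEscapingSlashes]
    let data = try encoder.encode(message)
    return String(decoding: data, as: UTF8.self)
}

let listLine = try encodeLine(ToolsListRequest(id: 1))
let callLine = try encodeLine(
    ToolsCallRequest(id: 2, params: .init(name: "add", arguments: ["a": 20, "b": 22]))
)
print(listLine)
print(callLine)
```

The client never needs to know how the tool is implemented. The name, the description and the argument schema from the tools/list answer are all it has to work with.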

A small MCP server in a few lines

We build the smallest possible server: an add tool that adds two numbers, over stdio, with the official MCP Swift SDK. Two handlers suffice, one for tools/list, one for tools/call:

import MCP

let server = Server(
    name: "calc-mcp", version: "0.1.0",
    capabilities: .init(tools: .init(listChanged: false))
)

await server.withMethodHandler(ListTools.self) { _ in
    let add = Tool(
        name: "add",
        description: "Add two integers and return the sum.",
        inputSchema: .object([
            "type": .string("object"),
            "properties": .object([
                "a": .object(["type": .string("integer")]),
                "b": .object(["type": .string("integer")]),
            ]),
            "required": .array([.string("a"), .string("b")]),
        ])
    )
    return .init(tools: [add])
}

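await server.withMethodHandler(CallTool.self) { params in
    // Only one tool exists, so the tool name is not checked here.
    // A missing or non-integer argument silently falls back to 0.
    let a = params.arguments?["a"]?.intValue ?? 0
    let b = params.arguments?["b"]?.intValue ?? 0
    return .init(content: [.text(text: "\(a + b)", annotations: nil, _meta: nil)], isError: false)
}

// Speak the protocol over stdin/stdout and keep the process alive while the server runs.
let transport = StdioTransport()
try await server.start(transport: transport)
await server.waitUntilCompleted()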

The inputSchema is an ordinary JSON Schema, here two integers a and b, both required. This is exactly the schema the model sees later when it calls the tool. A deterministic protocol test verifies the server, with no model involved at all:

$ ./scripts/mcp-smoke.sh
SMOKE OK: tools/list lists add; add(20,22)=42

Attaching and discovering

We attach the built server to apfel. --mcp takes a path to a local stdio server, or a URL for a remote one:

apfel --serve --port 11578 --mcp "$(pwd)/.build/debug/calc-mcp"

apfel runs the handshake, calls tools/list and picks up the tool. In the log:

mcp: …/calc-mcp - add

From now on the local endpoint knows the add tool, without a line of it sitting in the agent.

Two modes, one endpoint

In serve mode apfel behaves differently depending on the request, and both modes are useful.

If the client sends a chat request without its own tools, apfel orchestrates the tool loop itself. It calls add internally and returns the finished answer. The client never sees the tool call.

If the client sends its own tools along, apfel passes the MCP tools through as well and leaves the calls to the client. A request that needs add then yields a tool call to the client:

"tool_calls": [
  { "function": { "name": "add", "arguments": "{\"a\": 17, \"b\": 25}" } }
]

The client's own tools and the MCP tools coexist on the same endpoint. The client can drive the loop itself, exactly like the tool round-trip from Article 4.
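In the client-orchestrated mode the arguments arrive as a JSON string inside the JSON response, so the client has to decode twice. A minimal sketch of that step, reduced to the fields shown in the fragment above. The struct names ToolCall and AddArguments are our own, and a real response carries more fields than these:

```swift
import Foundation

// One entry of "tool_calls", reduced to the fields shown above.
struct ToolCall: Decodable {
    struct Function: Decodable {
        let name: String
        let arguments: String   // JSON encoded as a string, needs a second decode
    }
    let function: Function
}

// Arguments of the add tool, matching the server's inputSchema.
struct AddArguments: Decodable {
    let a: Int
    let b: Int
}

// The fragment from the response, wrapped as a bare array for this sketch.
let fragment = #"[{ "function": { "name": "add", "arguments": "{\"a\": 17, \"b\": 25}" } }]"#

let decoder = JSONDecoder()
let calls = try decoder.decode([ToolCall].self, from: Data(fragment.utf8))
let call = calls[0]

// Second decode: the arguments string into a typed value.
// This throws if a or b is missing, which is the failure discussed later on.
let args = try decoder.decode(AddArguments.self, from: Data(call.function.arguments.utf8))
print("\(call.function.name)(\(args.a), \(args.b)) = \(args.a + args.b)")
```

Decoding into a typed struct has a side effect that matters later in this article: invented keys do not pass silently, they fail the decode.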

Where the model misses the argument keys

So far MCP solves the first half of the problem cleanly. The second half remains. Let apfel drive the loop internally, and the same pattern as in Articles 4 to 7 shows up: the small model does not reliably hit the argument keys. Measured across three calls of “add 31 and 11” (own measurement, v0.10):

add({"value1": 31, "value2": 11}) = 0
add({"a": 31, "b": 11}) = 42
add({"a": 31, "b": 11}) = 42

Once in three the model misses the keys. The schema asks for a and b; the model writes value1 and value2. The server gets no values for a and b, takes the default zero, and correctly returns 0. The tool is not at fault: the server does exactly what it is told. The model told it the wrong thing. With add the miss is noticeable, because the model knows the right answer itself and states it in its reply. With a tool whose result the model cannot know, the miss would go unnoticed, and the agent would carry on with a wrong number.
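The default zero is what lets the miss pass silently. One possible variation, not what the v0.10 server does, is to refuse missing keys so the failure becomes visible to the client. The following is a sketch in plain Swift, independent of the SDK. AddArgumentError and parseAddArguments are hypothetical names, and in a real handler the failure case would presumably be mapped to a result with isError set to true:

```swift
// Sketch: reject missing keys instead of defaulting to zero.
// AddArgumentError and parseAddArguments are illustrative names, not SDK API.
enum AddArgumentError: Error, Equatable {
    case missing([String])   // names of the required keys that were absent
}

func parseAddArguments(_ arguments: [String: Int]) -> Result<(a: Int, b: Int), AddArgumentError> {
    // Collect every required key that the call did not supply.
    let missing = ["a", "b"].filter { arguments[$0] == nil }
    guard missing.isEmpty, let a = arguments["a"], let b = arguments["b"] else {
        return .failure(.missing(missing))
    }
    return .success((a: a, b: b))
}

// The two argument shapes from the measurement above.
for arguments in [["value1": 31, "value2": 11], ["a": 31, "b": 11]] {
    switch parseAddArguments(arguments) {
    case .success(let pair):
        print("sum:", pair.a + pair.b)
    case .failure(let error):
        print("error:", error)
    }
}
```

This does not make the model hit the keys more often. It only turns a silent wrong number into an explicit error that the caller can react to.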

The bridge to Article 7

This is exactly the weakness Article 7 already solved, only elsewhere. Constrained output forces the model’s reply into a schema, and the invented keys vanish. apfel does not apply that to MCP arguments, so they come back. But in the client-orchestrated mode our agent gets the tool calls itself and can apply the same discipline it applies to its own tools.
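Constrained output itself is the subject of Article 7 and is not repeated here. As a cheaper complement in the client-orchestrated mode, the agent could at least check the keys of an incoming tool call against the schema before executing it. This is a plain after-the-fact check, not constrained output, and all names in it (ToolSchema, KeyCheck, checkKeys) are our own assumptions:

```swift
import Foundation

// Sketch: compare the keys of a tool call against the tool's schema
// before executing it. All names here are illustrative.
struct ToolSchema {
    let properties: Set<String>   // keys the schema declares
    let required: Set<String>     // keys the schema demands
}

struct KeyCheck: Equatable {
    let missing: [String]   // required by the schema, absent in the call
    let unknown: [String]   // present in the call, not declared in the schema
    var isValid: Bool { missing.isEmpty && unknown.isEmpty }
}

func checkKeys(argumentsJSON: String, against schema: ToolSchema) throws -> KeyCheck {
    let object = try JSONSerialization.jsonObject(with: Data(argumentsJSON.utf8))
    guard let dictionary = object as? [String: Any] else {
        // Valid JSON but not an object: treat every required key as missing.
        return KeyCheck(missing: schema.required.sorted(), unknown: [])
    }
    let keys = Set(dictionary.keys)
    return KeyCheck(
        missing: schema.required.subtracting(keys).sorted(),
        unknown: keys.subtracting(schema.properties).sorted()
    )
}

let addSchema = ToolSchema(properties: ["a", "b"], required: ["a", "b"])
let bad = try checkKeys(argumentsJSON: #"{"value1": 31, "value2": 11}"#, against: addSchema)
let good = try checkKeys(argumentsJSON: #"{"a": 31, "b": 11}"#, against: addSchema)
print(bad.isValid, bad.missing, bad.unknown)
print(good.isValid)
```

What the agent does with an invalid call is a design choice: it can hand the list of missing and unknown keys back to the model and ask for a corrected call, or abort the step.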

A clear picture falls into place: MCP and constrained output solve different halves and do not interfere. MCP handles provisioning: a server delivers the tool, with no agent code. Constrained output handles operation: the model hits the arguments because the schema forces them. Together they make an agent that uses external tools and operates them reliably.

When to build, when to use MCP

The decision follows from the separation. A tool tightly coupled to the agent, one that knows the sandbox, the gate or the edit workflow, stays hand-built; it depends on exactly that closeness. A tool that wraps a self-contained capability, such as a database, a Git repo or an external API, is better off as an MCP server: written once, usable by any client, versioned independently of the agent. The rule of thumb is the boundary of the capability. What only our agent needs we write ourselves. What many could need becomes a server.

Demo repo: apfel-coding-agent v0.10

The state of this article is frozen as tag v0.10: https://codeberg.org/rotecodefraktion/apfel-coding-agent/src/tag/v0.10

Try the MCP server

Check out the tag and build the server:

git clone https://codeberg.org/rotecodefraktion/apfel-coding-agent.git
cd apfel-coding-agent
git checkout v0.10
swift build --product calc-mcp
./scripts/mcp-smoke.sh        # deterministic protocol test, no model

New in v0.10 over v0.9:

  • Sources/calc-mcp/ — the calculator MCP server (MCP Swift SDK)
  • config/mcp-servers.md — attaching apfel with --mcp, both modes, limits
  • docs/adr/005-mcp-statt-eigenbau.md
  • scripts/mcp-smoke.sh

Attach to apfel and call it through the model:

apfel --serve --port 11578 --mcp "$(pwd)/.build/debug/calc-mcp"
# in a second terminal, send a chat request that uses "add"

The smoke test checks the server deterministically over the protocol; the run against apfel shows the model calling it, including the argument fuzziness from this article.
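For the request from the second terminal, a sketch of how it could be built in Swift. Two things here are assumptions and not taken from the repo: the OpenAI-style path /v1/chat/completions and the placeholder model name "local-model". The request carries no tools of its own, so apfel would drive the tool loop internally:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking   // URLRequest lives here on Linux
#endif

// Minimal chat request body. Field names follow the OpenAI chat format.
struct ChatRequest: Encodable {
    struct Message: Encodable {
        let role: String
        let content: String
    }
    let model: String
    let messages: [Message]
}

// Build (but do not send) a request against the local endpoint.
// ASSUMPTION: path and model name are placeholders, check them against apfel.
func makeAddRequest(port: Int = 11578) throws -> URLRequest {
    guard let url = URL(string: "http://127.0.0.1:\(port)/v1/chat/completions") else {
        throw URLError(.badURL)
    }
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        ChatRequest(
            model: "local-model",
            messages: [.init(role: "user", content: "add 31 and 11")]
        )
    )
    return request
}

let request = try makeAddRequest()
print(request.httpMethod ?? "", request.url?.absoluteString ?? "")
```

Sending it is one more call, for example URLSession.shared.data(for: request) in an async context. Repeating the request a few times is what exposes the argument fuzziness.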

Conclusion

MCP takes writing the tools off the agent, not operating them. That separation is the real lesson: provisioning and operation are two problems, and they have two solutions, a server and constrained output. The agent can now use tools it did not build itself, and stays what it was from the start, an OpenAI-compatible client against a local model. In the next step we put this agent behind a server and give it an interface that is more than a terminal.


Previous article: The interactive terminal session. Next article: The agent behind a Hummingbird server. Repo tag: v0.10.