Sovereignty on borrowed ground
Article 12 · Series: A Local Coding Agent with apfel
For eleven articles we have built an agent whose model runs on your own Mac. No API key, no cloud endpoint, no billing per token. That sounds like independence. Before we place the agent into Xcode for the finale, a sober question is worth asking: what does that independence actually rest on? Two releases from a single week in June 2026 help answer it, because they mark the two edges between which apfel sits. This is an assessment, not a build article. The code stays as it is.
The local model stays the warm-up
On 8 June 2026 Anthropic released a Swift package that connects Apple’s Foundation Models framework to Claude (source: Anthropic blog, 2026-06-08). The pattern is laid out plainly: the on-device model handles fast local tasks, summarisation, extraction, typed Swift values via @Generable. As soon as a request calls for multi-step reasoning, code generation or a web search, the package hands off to Claude, in the cloud, against an Anthropic API key. The answer streams back into the same SwiftUI view.
Technically this is elegant, and for many apps it is exactly right. For our question it is mostly instructive. Here a vendor-provided tool uses the local model for serious work, and the path ends at the cloud provider. The Foundation Model is the warm-up there, not the agent, precisely the role this series denies it. We spent eleven articles showing that the small model suffices for an agent if you build the tools right. The official tool assumes it does not, and pushes the real work to Claude.
Borrowed and small
apfel rests on the same ground as the Anthropic package, the Foundation Models framework. And therefore on decisions we cannot influence. Apple determines which model ships, how large it is, when it changes and how stable its interface stays. The model is closed-source. We cannot inspect it, cannot swap it, cannot freeze it.
This is not an abstract worry. We have run into the consequences several times in this series. The 4096-token context window bounded our loops from the start and made the context manager necessary (Article 8). The model’s guardrails let harmless requests through one moment and blocked them the next, with no discernible pattern (Article 7). Both are behaviour Apple can shift with a macOS update without asking us. An agent that runs reliably today may react differently after the next update. Build on a closed platform and you build on moving ground, and you see the movement only once it has happened.
On top of the closedness comes plain size. The model has a reported three billion parameters (Article 6), and you feel it in agent use. It summarises, explains and translates reliably, but on multi-step planning it gives up, and it hits a tool call’s arguments only about two times in three, which is why we made editing workable across Articles 4 to 7 only with constrained output. This model suffices for an agent only because we keep the tools tightly led and the tasks small. Article 6 measured the boundary: small local edits yes, serious multi-step work no. So we borrow not just a model that can change, but a small one. Against that stands a real advantage, and it is more than convenience: this model is everywhere. Apple counts over two billion active devices, and on the growing, Apple-Intelligence-capable subset it sits in the platform itself, not in an app you install first. Build on it and you reach every one of those devices with no download, no key, no setup. The price is paid in performance, not in reach.
The open alternative costs effort
On that same 8 June, at WWDC26, Apple showed what a fully local agentic stack without the Foundation Model looks like (source: Apple, WWDC26 session “Run local agentic AI on the Mac using MLX”, 2026-06-08). The stack has four layers: MLX as the compute foundation, MLX-LM for loading and quantising models, an MLX-LM server, and any agent on top. The server is OpenAI-compatible and understands structured tool calling, “a drop-in replacement for any cloud LLM API”. That is exactly the pattern apfel-serve gave us, and the one this whole series builds on.
The difference is the model. With MLX you choose it yourself, from thousands of open models, and you can inspect it, swap it, freeze it. Two observations from the session stand out for us. First, for agentic work Apple explicitly promotes MLX with open models, not the Foundation Model. So the series uses precisely the model Apple itself does not position as an agent backend. Second, the session speaks of “hundreds of thousands of tokens” per agentic session, while we have been working with 4096. The tight context limit is not a property of local AI as such, but of this one model that Apple chose.
This openness has a price, and it is real, in two currencies. The first is effort: installation, a model choice, quantisation, a download of several gigabytes. The second is hardware, and it scales with the model. A good, quantised model, clearly more capable than apfel’s Foundation Model, runs decently today on a single current Mac with enough unified memory; that is the realistic path for most. Only the largest frontier models burst a single machine, and for those the session shows the high end: Macs with 512 GB of RAM and the biggest models spread across several Macs over Thunderbolt (Apple, WWDC26, 2026-06-08). The hurdle is therefore not a monster on the desk, but a graded effort, more RAM and a little setup for a clearly better model. apfel’s Foundation Model demands none of this and runs on every current Mac, because it is small. The very thing that makes it weak also makes it frugal.
The triangle
Between these poles the position of the apfel series becomes legible. Three ways to use a local, or locally beginning, model for an agent, each with a price:
| Path | Model | Place | Hardware | Effort | Lock-in |
|---|---|---|---|---|---|
| MLX + open model | freely chosen, open, inspectable, any size | fully local | scales with the model: one current Mac for good quantised ones, several for frontier size | setup, download, quantisation | none |
| Anthropic package | Foundation Model as warm-up, then Claude | local + cloud handoff | low locally, the cloud carries the load | low | API key, provider |
| apfel / Foundation Model | set by Apple, closed, small | fully local | low, every current Mac | practically none | platform (Apple) |
MLX is the most sovereign path and the most involved, in effort and a bit of hardware; in return it runs freely chosen, clearly more capable models. The Anthropic package is the most convenient and borrows its power from the cloud. apfel is the lowest-threshold one, runs on any machine and stays entirely on the device, and pays for it with a small model on borrowed, closable ground. None of the three gets everything: performance, openness, low barrier, those are three corners, and you pick two.
The middle path and its limit
apfel sits in the middle of this triangle, and that is a deliberate choice, not an embarrassment. The value of the middle path is the low threshold: an agent that runs at once on any Apple silicon Mac, no key, no download, no provider behind it and no special hardware. For a way into agentic coding, for small local tasks with a tool that costs nothing and gives nothing away, that is a genuinely good starting point. For more, the small model hits exactly the limits Article 6 measured, and then the path leads to a larger model, that is to MLX with a bit more RAM and setup or back to the cloud.
The honesty that belongs with it is the assessment of that path. Sovereignty on your own device is real, as long as you do not mistake it for independence. Our data does not leave the Mac, that is true and valuable. But the model that processes it is not ours, and the platform that ships it can change it tomorrow. Whoever wants real independence, a model they own and control, ends up at open weights and MLX, and pays in effort. This series deliberately chose the other price, and it is fair to name both.
Yet what we learned is not borrowed, and neither is the agent itself. It does not know the Foundation Model at all. It speaks the OpenAI protocol against apfel-serve, a deliberate decision from Article 2, and the MLX-LM server with its open models speaks exactly that protocol too. A model change is therefore no rebuild but a backend swap: set up the open model via MLX, point the base URL at it, done. The loop, the tools, the constrained editing, the three surfaces, all of it moves over unchanged. Over twelve articles we did not understand one model, but how to build an agent, and that knowledge runs on any model that speaks the same protocol. If Apple shifts the foundation tomorrow, we carry the agent, with some effort, onto another.
Conclusion
The local agent is no final victory for sovereignty, and it claims to be none. It is a workable middle path: local, free, immediately available, and its model at the same time bound to a decision made in Cupertino, not by us. The Anthropic package shows where the convenient path leads, back to the cloud. MLX shows that the fully free path is open on the same device, for effort. apfel makes the in-between usable, and because we built against the protocol, the way to another model stays open at any time. In the final step of the series we put this agent where the cloud assistants usually sit: into Xcode.
Previous article: The agent in the browser. Next article: the local agent in Xcode (placeholder, link finalized when Article 13 is published). This article is a conceptual interlude with no repo tag of its own; the code stays at v1.0.