The Model Is Already There — A Prologue to the Local Coding Agent

Prologue · Series: A Local Coding Agent with apfel

On every Apple Silicon Mac running macOS 26, a language model is running that we barely treat as one. It sits in the system, drafts email replies, summarizes notes, rewrites text passages — but as a programmable tool it remains out of reach. apfel exposes this model as a command line and as an OpenAI-compatible server. That’s the starting point of this series: we build a coding agent in Swift that uses this local model as a backend — no API key, no cloud endpoint, no token bill. Over twelve articles we work our way from the CLI to integration with Xcode 26.3 via the Model Context Protocol. Two critical interludes along the way examine what the small on-device model can handle and what platform dependency we accept by going local.

The Model Already Living in the Mac

Apple opened up the Foundation Models framework with macOS 26 — the same model lineage that has been feeding the Writing Tools, Mail summaries, and other Apple Intelligence features with language¹. Requirements: an Apple Silicon Mac (M1 or newer), macOS 26, and Apple Intelligence enabled in System Settings. The model lives on the device — no download needed, no network connection required. What we have been missing is a convenient path from the terminal or from our own code to this model.

¹ Apple Developer: FoundationModels framework, https://developer.apple.com/documentation/foundationmodels (accessed 2026-06-02).

What apfel Opens Up

apfel is a small Swift CLI, installable via Homebrew. It knows three modes.

Prompt mode for one-shot requests:

apfel "summarize the contents of this file" -f notes.md

Serve mode as a local HTTP server speaking the OpenAI-compatible Chat Completions API:

apfel --serve
# Server listens locally, exact port covered in Article 2

This lets any tool that expects an OpenAI endpoint redirect to the local model — from scripts to more complex agents.

Chat mode for an interactive session in the terminal:

apfel --chat

Chat mode keeps context across multiple turns, limited by the model’s context window. We work through the exact command options and version-specific quirks in Article 1.

What Local Means

Local means: no API key in a .env file, no endpoint that changes its pricing tomorrow, no telemetry going to a vendor. Data stays on the device. For a series building a coding agent, that’s more than a technical detail — code that runs through the agent doesn’t leave the machine.

Local also means: no cloud pace. An on-device model with a few billion parameters has a different speed and a different capacity than the heavily scaled models we may silently measure against in our heads. Asking Claude or ChatGPT a question implicitly benchmarks against clusters that spend two orders of magnitude more compute per answer. That comparison isn’t the standard here — the standard is what works sensibly local.

Where the Local Model Runs Thin

The Foundation Model in macOS 26 is small. Third-party sources report an order of magnitude around three billion parameters² — roughly the level Qwen-3-4B or Llama-3-3B marked a year ago. The context window is small too, well below the 128k or 200k we know from the cloud models. What that means in practice we measure in Article 6: small, local edits — yes. Multi-step plans with five files of context — wobbly to no.

² Apple Machine Learning Research (10 June 2024, updated 29 July 2024): „Introducing Apple’s On-Device and Server Foundation Models", https://machinelearning.apple.com/research/introducing-apple-foundation-models — Apple’s wording: „a ~3 billion parameter on-device language model". An official spec for the macOS 26 iteration is not public as of 2026-06; concrete measurement in the evaluation of Article 6.

The second point is the foundation apfel rests on. apfel uses Apple’s official FoundationModels framework — a documented Swift API, no undocumented paths. What remains is the platform dependency itself: Apple decides when the Foundation Model is updated, in which direction it evolves, how stable the interface stays. The model is closed-source — no audit of training data, no fine-tuning, no cross-platform fallback. A macOS update can shift its behavior. That’s no different with other providers — cloud models also run closed-source, without audit, bound to a single vendor’s platform. Article 11 takes this tension up in detail.

We mention this not to qualify the series before it has begun — we mention it so it’s clear what we’re building on. Sovereignty on a foundation that follows a single platform decision is a different kind of sovereignty than a fully open model runtime.

What This Series Is and Isn’t

The series is a study of the agent loop. We build a tool that can talk to a model on its own machine, one that uses tools, reads files, shows diffs, asks for confirmation, works through plans. Across the twelve main articles the path leads from the first CLI response to integration with Xcode 26.3 — three different surfaces on the same AgentCore.

The series is not a Claude Code clone. We’re not building a product that aims to compete with cloud agents. We’re building a tool that shows what the mechanics underneath such agents look like — and that, within the limits of what works locally, runs decently. The series is not a winner’s comparison either. When we measure the local model against a cloud agent in Article 6, it’s about deployment profiles, not scoreboards.

Outside the arc: MLX directly and alternative model runtimes (cross-reference to the Hummingbird series), VS Code integrations, multi-agent orchestration. And anything that requires a platform other than Apple Silicon with macOS 26.

Joining In

The demo code grows openly on Codeberg: rotecodefraktion/apfel-coding-agent. One git tag per article (v0.1 through v1.1) freezes the state at the end of each article. To follow along:

Apple Silicon Mac, macOS 26 (Tahoe), Apple Intelligence enabled in System Settings
brew install apfel (exact command in Article 1)
Clone the repo, check out the tag per article, follow along

The exact version states (apfel, macOS, Xcode) live in the frontmatter of each article and in docs/setup.md of the demo repo. When a later article requires an apfel update, we name the jump explicitly.

How It Continues

Article 1 takes apfel itself apart: installation, the three modes in detail, JSON output for scripts, system prompts, file input. Before we build the agent, we get to know the tool it sits on.

Next article: apfel from the command line (placeholder — link will be finalized when Article 1 is published). Repo tag: none (concept article).