apfel from the command line

Article 1 · Series: A Local Coding Agent with apfel

Before we build an agent, we get to know the tool it sits on. apfel is a small Swift CLI that exposes Apple’s Foundation Model on macOS 26 as a command line, an OpenAI-compatible HTTP server, and an interactive chat. We install apfel via Homebrew, walk through the three modes (prompt, chat, serve) in that order, and write a handful of small shell scripts along the way that will later serve as building blocks for the agent. At the end we have a demo repo on Codeberg pinned to tag v0.1 and a first qualitative impression of what the small on-device model handles well and where it gives way.

Verifying Requirements

apfel assumes a very specific platform. We verify it before installing.

sw_vers                   # macOS 26.3 (Tahoe) or newer
uname -m                  # arm64
xcodebuild -version       # Xcode 26.4 or newer (optional)

The hard requirements: an Apple Silicon Mac (M1 or newer), macOS 26.3 (Tahoe), and Apple Intelligence enabled in System Settings. Without Apple Intelligence enabled, apfel returns exit code 5 (“Model unavailable”) on every call. Xcode 26.4+ is only needed if we later want to build the HEAD variant of apfel from source; for the Homebrew bottle it’s not required.

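| Component          | State at v0.1       | What for                          |
|--------------------|---------------------|-----------------------------------|
| Hardware           | Apple Silicon (M1+) | arm64 architecture, Neural Engine |
| macOS              | 26.3 (Tahoe)        | Foundation Models framework       |
| Apple Intelligence | enabled             | Model is otherwise not unlocked   |
| Homebrew           | current             | Installation package manager      |
| Xcode (optional)   | 26.4                | Source build of apfel             |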
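
The two machine-checkable requirements can be folded into one preflight function, so a setup script fails early with a readable message. This is a minimal sketch, not part of apfel or the demo repo: the function names version_ge and check_platform are our own, and the version comparison deliberately looks at major.minor only.

```shell
#!/bin/sh
# Preflight sketch. The thresholds (arm64, macOS 26.3) come from the article;
# version_ge and check_platform are illustrative names, not apfel commands.

# version_ge HAVE WANT: succeeds if HAVE >= WANT, comparing major.minor only.
version_ge() {
  have_major=${1%%.*}
  have_rest=${1#"$have_major"}
  have_rest=${have_rest#.}
  have_minor=${have_rest%%.*}
  want_major=${2%%.*}
  want_rest=${2#"$want_major"}
  want_rest=${want_rest#.}
  want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] && return 0
  [ "$have_major" -lt "$want_major" ] && return 1
  # Same major version: a missing minor part counts as 0.
  [ "${have_minor:-0}" -ge "${want_minor:-0}" ]
}

# check_platform ARCH OS_VERSION: prints one line per unmet requirement
# and returns non-zero if any requirement is unmet.
check_platform() {
  status=0
  if [ "$1" != "arm64" ]; then
    echo "need Apple Silicon (arm64), found: $1"
    status=1
  fi
  if ! version_ge "$2" "26.3"; then
    echo "need macOS 26.3 or newer, found: $2"
    status=1
  fi
  return "$status"
}
```

On a Mac we would call it as check_platform "$(uname -m)" "$(sw_vers -productVersion)". Whether Apple Intelligence is enabled is not covered here; per the requirements above, that shows up as exit code 5 on the first apfel call.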

Installation

brew install apfel
apfel --version           # apfel v1.5.1
apfel --release           # detailed release and build info
apfel --help              # all modes and flags at a glance

The Homebrew formula pulls the current bottle. At tag v0.1 that is apfel 1.5.1; the version is recorded in this article’s frontmatter and in docs/setup.md of the demo repo. When the version changes in later articles, we call out the change explicitly in the article body.

apfel --help is the most important first read. It shows the three modes — prompt, --serve, --chat — as primary uses and lists the flags with descriptions. The USAGE line is binding:

USAGE:
  apfel [OPTIONS] <prompt>       Send a single prompt
  apfel -f <file> <prompt>       Attach file content to prompt
  apfel --chat                   Interactive conversation
  apfel --serve                  Start OpenAI-compatible HTTP server

Flags come before the positional prompt argument. apfel -s "..." "..." is correct; apfel "..." -s "..." silently ignores whatever follows the prompt. That sounds trivial; we return to it in “Setup Pitfalls.”

Prompt Mode

Prompt mode is a single self-contained request. We pass a string, apfel sends it to the local model, and writes the response to stdout.

apfel "What is a closure in Swift, in two sentences?"

This is the building block everything else sits on. The agent loop we build later is at its core a loop of such calls, with added context, tool calling, and confirmation gates.

A first useful example instead of “Hello World” is a commit message suggestion from a staged diff:

git diff --cached | apfel \
  -s "You write Conventional Commits in one line, max 60 characters, lowercase except proper nouns, no trailing period." \
  "Write a fitting message for this diff."

stdin carries the diff, the positional prompt steers behavior, -s sets the role. Three mechanics in one line, and that’s at its core what we’ll encapsulate as a tool call in the agent.
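
Because the model is small, a script should not trust that the answer obeys the system prompt. The following is a minimal sketch of a wrapper that reads the diff from stdin, refuses to call the model on an empty diff, and rejects any answer that is not a single line of at most 60 characters without a trailing period. The function names are our own and not part of the demo repo; the length is whatever the shell reports for ${#var}.

```shell
#!/bin/sh
# Sketch: validate the model's answer before using it as a commit subject.
# is_valid_subject and suggest_commit_message are illustrative names.

# is_valid_subject TEXT: succeeds if TEXT is one non-empty line of at most
# 60 characters that does not end in a period.
is_valid_subject() {
  subject=$1
  [ -n "$subject" ] || return 1
  case $subject in
    *"
"*) return 1 ;;  # contains a newline, so more than one line
    *.) return 1 ;; # trailing period
  esac
  [ "${#subject}" -le 60 ]
}

# suggest_commit_message: reads a diff on stdin, prints a validated subject.
suggest_commit_message() {
  diff=$(cat)
  if [ -z "$diff" ]; then
    echo "nothing staged" >&2
    return 1
  fi
  # Flags first, positional prompt last, diff on stdin.
  message=$(printf '%s\n' "$diff" | apfel \
    -s "You write Conventional Commits in one line, max 60 characters, lowercase except proper nouns, no trailing period." \
    "Write a fitting message for this diff.") || return $?
  if ! is_valid_subject "$message"; then
    echo "rejected model output: $message" >&2
    return 1
  fi
  printf '%s\n' "$message"
}
```

Usage would be git diff --cached | suggest_commit_message. A rejected answer is a signal to retry or fall back to writing the message by hand.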

JSON Output and the Script Pattern

The scripting pattern is built into apfel: -o json switches from plain-text response to structured JSON, letting responses pipe cleanly through jq.

apfel -o json "Explain higher-order functions in one sentence." | jq -r '.content'

This exact pattern lives in the demo repo as examples/cli/04-json-pipe.sh. It’s three lines long and shows how an apfel call becomes a UNIX tool that fits into pipes.

For scripts, apfel’s clean exit codes are useful:

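| Code | Meaning                                          |
|------|--------------------------------------------------|
| 0    | Success                                          |
| 1    | Runtime error                                    |
| 2    | Usage error (bad flags)                          |
| 3    | Guardrail blocked (content policy)               |
| 4    | Context overflow (input too long)                |
| 5    | Model unavailable (Apple Intelligence not enabled) |
| 6    | Rate limited / busy                              |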

A script calling apfel can react to 5 by pointing the user to Apple Intelligence, to 4 by retrying with shorter context, to 3 by logging the guardrail. That’s more structure than many cloud CLIs offer.
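
A thin wrapper can turn those codes into actionable hints. This is a sketch under the assumption that the exit codes are exactly those listed above; describe_apfel_exit and run_apfel are names we made up for illustration, and the hint texts are our own wording.

```shell
#!/bin/sh
# Sketch: map apfel's documented exit codes to hints for the user.

# describe_apfel_exit CODE: prints a one-line hint for CODE.
describe_apfel_exit() {
  case $1 in
    0) echo "success" ;;
    1) echo "runtime error" ;;
    2) echo "usage error: check the flags" ;;
    3) echo "guardrail blocked: log the prompt and move on" ;;
    4) echo "context overflow: retry with shorter input" ;;
    5) echo "model unavailable: enable Apple Intelligence in System Settings" ;;
    6) echo "rate limited or busy: retry later" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# run_apfel ARGS...: runs apfel with ARGS, prints a hint to stderr on
# failure, and passes apfel's exit code through unchanged.
run_apfel() {
  status=0
  apfel "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "apfel: $(describe_apfel_exit "$status")" >&2
  fi
  return "$status"
}
```

Since the exit code is passed through, callers can still branch on it, for example retrying with a trimmed input when run_apfel returns 4.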

The Foundation Model does not respond deterministically. For reproducible smoke tests there’s --seed <n>. When we later write tests against model behavior in the agent, --seed is the anchor.
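
Before relying on --seed in tests, it is worth checking that it actually pins the output in our setup. The sketch below runs the same prompt twice with the same seed and compares the answers. That identical seeds yield byte-identical responses is an assumption we test here, not a guarantee taken from the apfel documentation; stable_under_seed is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a fixed seed gives the same answer twice.

# stable_under_seed SEED PROMPT: succeeds if two runs with SEED return
# identical output; passes through apfel's exit code if a run fails.
stable_under_seed() {
  first=$(apfel --seed "$1" "$2") || return $?
  second=$(apfel --seed "$1" "$2") || return $?
  [ "$first" = "$second" ]
}
```

A smoke test could start with stable_under_seed 42 "Explain recursion in one sentence." and skip the exact-match assertions if the check fails.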

System Prompt and File Input

-s "<role>" sets a system prompt that defines persona or output format. File content reaches the model by two documented paths, both listed in apfel --help.

Variant A: -f as the apfel-native flag.

apfel -f notes.md "Summarize the following content in three sentences."
apfel -f a.txt -f b.txt "Compare these two files."

-f is repeatable; multiple files attach in one request.

Variant B: stdin (pipe or input redirect).

apfel "Summarize the following content in three sentences." < notes.md
cat notes.md | apfel "Summarize the following content in three sentences."

Both work. The demo scripts in the repo use stdin redirect (< file) because it has no external dependency and chains well with other UNIX tools. -f is the more compact form in the multi-file case.
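
For the multi-file case, a small helper can build the repeated -f flags and keep them in front of the prompt, which is the ordering the USAGE line requires. This is a sketch; apfel_with_files is our own name, and returning 2 for an unreadable file is our choice, borrowed from the usage-error code.

```shell
#!/bin/sh
# Sketch: attach any number of files with -f, flags before the prompt.

# apfel_with_files PROMPT FILE...: calls apfel -f FILE1 -f FILE2 ... PROMPT.
apfel_with_files() {
  prompt=$1
  shift
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "cannot read: $f" >&2
      return 2
    fi
  done
  # Rotate through the positional parameters, turning each FILE into
  # the pair "-f FILE" while preserving order and spaces in names.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" -f "$1"
    shift
    n=$((n - 1))
  done
  apfel "$@" "$prompt"
}
```

The call apfel_with_files "Compare these two files." a.txt b.txt expands to the same command line as the two-file -f example above.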

One thing the small on-device model teaches us right away: for code tasks the system prompt must explicitly focus on the given input, otherwise the model readily invents a different piece of code and explains that instead. What works:

apfel \
  -s "You are a precise senior developer. Explain ONLY the code provided. Do not invent other code, do not write your own variant." \
  "Explain what this code does." \
  < fibonacci.swift

The script examples/cli/02-explain-code.sh does exactly that. Without the anti-hallucination addition in the system prompt, the model was repeatedly more creative than necessary in our first smoke test.

Chat Mode

apfel --chat opens an interactive session in the terminal. The model holds context across multiple turns until we end the session or the context window overflows.

apfel --chat -s "You are a calm coding assistant. Answer briefly and clearly."

apfel ships with strategies for managing context as the session grows long:

  • newest-first (default) — oldest turns get evicted first
  • oldest-first — newer turns yield, oldest stay
  • sliding-window with --context-max-turns <n> — fixed number of turns
  • summarize — apfel compresses older turns on its own
  • strict — error on overflow, no automatic trimming
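
To make the difference between the eviction orders concrete, here is a toy model of three of these strategies. It is not apfel’s implementation: it treats each input line as one turn and uses a turn count as the budget, whereas the real limit is the context window. trim_turns is a hypothetical name, and the exit code 4 in the strict case merely mirrors the context-overflow code from the table above.

```shell
#!/bin/sh
# Toy model of context trimming. One line on stdin is one turn.

# trim_turns STRATEGY MAX: prints the turns that survive under STRATEGY
# when at most MAX turns fit.
trim_turns() {
  case $1 in
    newest-first)
      tail -n "$2" ;;   # oldest turns are evicted, the newest MAX stay
    oldest-first)
      head -n "$2" ;;   # newer turns yield, the oldest MAX stay
    strict)
      # No trimming: fail with 4 if the turns do not fit.
      awk -v max="$2" '
        { lines[NR] = $0 }
        END {
          if (NR > max) exit 4
          for (i = 1; i <= NR; i++) print lines[i]
        }' ;;
    *)
      echo "unknown strategy: $1" >&2
      return 2 ;;
  esac
}
```

Feeding the three turns t1, t2, t3 through trim_turns newest-first 2 keeps t2 and t3, while oldest-first keeps t1 and t2.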

--context-status enables a display after each turn that reports the context-window fill level. It’s one of the most useful flags for understanding the on-device model: we see directly when we’re hitting the limit.

In the demo repo, a thin wrapper script examples/cli/07-chat-session.sh sets the system prompt to a brief-and-clear style and keeps the default context strategy. Chat mode is the sandbox where we can later trace the Plan/Act/Observe mechanics of the agent most directly.

A Taste of Serve Mode

apfel --serve starts an HTTP server on 127.0.0.1:11434 with an OpenAI-compatible API.

apfel --serve
# Server listens on http://localhost:11434/v1

From another terminal:

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "apple-foundationmodel",
    "messages": [{"role": "user", "content": "Explain higher-order functions in one sentence."}]
  }' | jq -r '.choices[0].message.content'

This is the bridge our Swift agent will dock to in Article 3. /v1/chat/completions with streaming via SSE, /v1/models, and /health are the most important endpoints. apfel describes itself as a drop-in replacement for any SDK expecting an OpenAI endpoint — which makes serve mode the actual door opener for our own tools.
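
When a script starts the server itself, it has to wait until the server answers before sending the first request. The sketch below polls the /health endpoint named above; that a successful HTTP status means "ready" is our assumption, and wait_for_server with its default of 20 tries at one-second intervals is an illustrative choice.

```shell
#!/bin/sh
# Sketch: wait until the serve-mode health endpoint responds.

# wait_for_server BASE_URL [TRIES] [DELAY_SECONDS]
# Succeeds as soon as BASE_URL/health returns a successful HTTP status.
wait_for_server() {
  base_url=$1
  tries=${2:-20}
  delay=${3:-1}
  while [ "$tries" -gt 0 ]; do
    # -s: silent, -f: treat HTTP errors as failure, --max-time: no hanging
    if curl -sf --max-time 2 "$base_url/health" >/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "no response from $base_url/health" >&2
  return 1
}

# Example (not executed here):
#   apfel --serve &
#   wait_for_server http://localhost:11434 && echo "server is up"
```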

We only step in briefly here. Article 2 takes serve apart in detail: all endpoints, token auth, CORS, everything that’s just a list in --help at this point.

Where the Model Holds Up, Where It Gives Way

After the smoke test with the demo scripts, a first qualitative impression takes shape. This is not an eval; we run that in Article 6 with a task canon and source labels per number. But it does show a rough line.

The model holds up:

  • Summarizing medium text segments (a note of a few sentences gets cleanly mirrored into three sentences)
  • Translation German ↔ English in both directions, idiomatically
  • Code explanation when the system prompt focuses on the provided code
  • Diff review in the 3-point format, when the task is clearly scoped
  • Concept explanations on programming topics (higher-order functions, recursion, closures)

The model gives way:

  • Geographic and political fact questions in German. “Was ist die Hauptstadt von Österreich?” returned a marketing reflex about “websites of the responsible authorities” in our measurement. The same question in English, “What is the capital of Austria?”, returns “Vienna” cleanly. The guardrails fire language-asymmetrically¹.
  • Generic “name three …” prompts trigger the marketing reflex even in English. --permissive loosens the filters but doesn’t always help here.
  • Code hallucination: without an explicit focus prompt, on code tasks the model readily invents its own code instead of explaining the one provided.

¹ Our own measurement on 2026-06-02 with apfel 1.5.1 on macOS 26.3, documented in the series buildlog.

The model is small (Apple Machine Learning Research mentions around three billion parameters² for the on-device variant of the first Apple Intelligence generation) and stochastic: the same call can return a clean answer one time and a deflection the next. For reproducible tests we set --seed. For tasks that need to run reliably, an anti-hallucination prompt with clear focus on the input is worth the effort.

² Apple Machine Learning Research (June 2024, updated July 2024): “Introducing Apple’s On-Device and Server Foundation Models”, https://machinelearning.apple.com/research/introducing-apple-foundation-models.
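
Because a single call proves little with a stochastic model, a quick way to get a feel for the spread is to repeat a prompt and count how often the expected answer appears. This is a rough sketch, not the eval from Article 6: count_matches is a hypothetical helper, the match is a plain case-insensitive substring search, and a failed apfel call simply counts as a miss.

```shell
#!/bin/sh
# Sketch: repeat one prompt and count responses containing a pattern.

# count_matches RUNS PATTERN PROMPT: prints "HITS/RUNS".
count_matches() {
  runs=$1
  pattern=$2
  prompt=$3
  hits=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # No --seed here: the point is to observe run-to-run variation.
    if apfel "$prompt" | grep -qi -- "$pattern"; then
      hits=$((hits + 1))
    fi
    i=$((i + 1))
  done
  echo "$hits/$runs"
}
```

For the example from the list above, count_matches 10 "Vienna" "What is the capital of Austria?" would report how many of ten runs name the city.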

Demo Repo: apfel-coding-agent v0.1

The state at the end of this article is frozen on Codeberg as tag v0.1: https://codeberg.org/rotecodefraktion/apfel-coding-agent/src/tag/v0.1. Anyone following along finds everything we did here — the seven example scripts, the setup doc, and the CLAUDE.md with the series conventions.

Setting up apfel-coding-agent v0.1

Clone and check out the tag:

git clone https://codeberg.org/rotecodefraktion/apfel-coding-agent.git
cd apfel-coding-agent
git checkout v0.1
chmod +x examples/cli/*.sh

Contents at tag v0.1:

  • README.md: series link and quick start
  • CLAUDE.md: conventions for code sessions (language, stack, path layout)
  • LICENSE: MIT
  • .gitignore: Swift/macOS standard ignores
  • docs/setup.md: installation, USAGE rules, file-input variants, exit codes, language-asymmetric guardrails, state snapshot
  • examples/cli/: seven small scripts:
    • 01-summarize-notes.sh: summarize a note (stdin redirect)
    • 02-explain-code.sh: explain code with an anti-hallucination prompt
    • 03-suggest-commit-message.sh: Conventional Commit from git diff --cached
    • 04-json-pipe.sh: -o json | jq as a script pattern
    • 05-translate.sh: translate with a system prompt
    • 06-explain-diff.sh: diff review in the 3-point format
    • 07-chat-session.sh: interactive chat session with a default system prompt

First test that everything responds:

echo "The series builds a local coding agent in Swift on top of apfel." | apfel "Summarize this in one sentence."

When a short summary comes back, the installation works.

Setup Pitfalls

Three traps that cost time on the first run, as a take-away for anyone following along later.

Argument order. apfel’s USAGE prescribes apfel [OPTIONS] <prompt>: flags must come before the positional prompt. apfel "..." -f file.md ignores the file; apfel -f file.md "..." is correct. We misread the --help output at first, and that cost the scripts all their file inputs on the first attempt.

Language asymmetry of the guardrails. The capital-of-Austria question in German hits a more restrictive filter than the same question in English. For smoke tests in a German-language series this means: switch fact questions to English, or keep the system prompt in English. In the demo repo all scripts use English system prompts; user inputs (notes, code, diffs) are language-neutral.

Code hallucination without focus. When we ask the model to explain a piece of code without specifying in the system prompt “explain ONLY the code provided,” it readily invents a different one and explains that with confidence. The anti-hallucination addition in 02-explain-code.sh is a correctness measure, not a style preference.

How It Continues

Article 2 takes serve mode apart in detail: all three endpoints, the Chat Completions schema with SSE streaming, token auth and CORS, the anatomy of an OpenAI drop-in. With that we have the foundation our Swift client can dock to in Article 3.


Previous article: The Model Is Already There — A Prologue to the Local Coding Agent. Next article: Serve Mode and the OpenAI Protocol (placeholder — link finalized when Article 2 is published). Repo tag: v0.1.