LLM Fundamentals

How language models really work — from tokens and embeddings through neural networks to the Transformer architecture and fine-tuning.

ChatGPT, Claude, Gemini — language models have become an integral part of everyday life. But what exactly are they, and what happens under the hood? In this series, I explain step by step how language models work — from the fundamental concepts to the complete architecture. Eight articles, each building on the last, plus a bonus chapter where all the building blocks come together into a working mini language model.

  1. The Next Word — How Language Models Work

    What happens between input and output? Tokens, probability distributions, and sampling strategies — explained step by step, with real code. (A tiny sampling sketch follows this list.)
    18 Apr 2026
  2. Words as Points in Space — What Embeddings Are

    How language models encode meaning in numbers. Embedding tables, cosine similarity, vector arithmetic, and why King minus Man plus Woman equals Queen. (See the embedding-arithmetic sketch after this list.)
    18 Apr 2026
  3. Neural Networks from Scratch

    What happens between embedding and logit. Neurons, layers, forward pass, and activation functions — fully implemented in numpy, no framework magic.
    21 Apr 2026
  4. Backpropagation — How a Model Learns

    How neural networks learn from errors. Loss, gradients, chain rule, gradient descent — backpropagation implemented by hand on a 2-layer MLP that learns XOR and token prediction.
    22 Apr 2026
  5. Context and RNNs — Why Order Matters

    Why language needs memory, and how the first language models learned to capture context. Recurrent networks, LSTMs, and the long-sentence problem, explained without a storm of formulas.
    24 Apr 2026
  6. Attention Is All You Need

    How the bottleneck of RNNs was replaced by a mechanism that links every token to every other. Query, Key, Value, Multi-Head, and the paper that turned the NLP world on its head, explained with library metaphors and just enough math for the curious. (A bare-bones attention sketch follows this list.)
    30 Apr 2026
  7. The Transformer: The Complete Architecture

    Position, depth, stability — what gets added on top of attention to make a complete transformer. Positional encodings, feed-forward layers, residual connections, layer normalization, the entire block in Python, and the leap to GPT, BERT, Llama, and Claude.
    3 May 2026
  8. Fine-Tuning: From Base Model to Assistant

    How a base model that completes text becomes a helpful assistant. Supervised Fine-Tuning, RLHF, DPO, and Constitutional AI — the last piece of the LLM pipeline, with the candid question of what alignment actually solves.
    5 May 2026
  9. Bonus: Building Lal — A Small Base Model from the Series' Building Blocks

    Eight articles of theory, one bonus chapter of practice. We combine all the code fragments from the LLM series into a working mini language model, train it on TinyShakespeare, and tack on a tiny SFT step. With a nod to Star Trek TNG.
    7 May 2026
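
A small appetizer before you dive in: a minimal sketch of article 1's core loop, how a model picks the next word. Everything here is invented for the demo (the vocabulary, the logit values) and merely stands in for what a trained model would produce; it is not the series' actual code.

    import numpy as np

    # Toy preview: a model scores candidate next tokens (logits), softmax
    # turns the scores into a probability distribution, and a sampling
    # strategy picks the next token. All values are invented for the demo.

    vocab = ["the", "cat", "sat", "on", "mat"]       # invented toy vocabulary
    logits = np.array([2.0, 0.5, 1.0, -1.0, 0.3])    # invented model scores

    def softmax(x, temperature=1.0):
        """Turn logits into probabilities; temperature controls randomness."""
        z = x / temperature
        z = z - z.max()                              # subtract max for stability
        e = np.exp(z)
        return e / e.sum()

    probs = softmax(logits, temperature=0.8)
    print(dict(zip(vocab, probs.round(3))))
    print("next token:", np.random.choice(vocab, p=probs))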
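
A second sketch, for article 2's famous analogy. The four 3-dimensional vectors below are hand-made so that the arithmetic works out; real embeddings have hundreds of dimensions and are learned from data, not set by hand.

    import numpy as np

    # Hand-crafted toy embeddings (invented numbers, not learned vectors)
    # to illustrate "King - Man + Woman ≈ Queen" via cosine similarity.

    emb = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "man":   np.array([0.9, 0.1, 0.1]),
        "woman": np.array([0.1, 0.1, 0.9]),
        "queen": np.array([0.1, 0.8, 0.9]),
    }

    def cosine(a, b):
        """Cosine similarity: 1.0 means same direction."""
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    target = emb["king"] - emb["man"] + emb["woman"]
    best = max(emb, key=lambda w: cosine(emb[w], target))
    print(best)                                      # -> queen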
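
And finally, the mechanism at the heart of article 6: scaled dot-product attention, run on random toy matrices just to show the moving parts. No training, no multi-head, no masking; a bare-bones sketch, not the series' implementation.

    import numpy as np

    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V,
    # applied to random matrices to show shapes and data flow only.

    rng = np.random.default_rng(0)
    seq_len, d_k = 4, 8                      # 4 tokens, 8-dim queries/keys/values
    Q = rng.normal(size=(seq_len, d_k))      # what each token is looking for
    K = rng.normal(size=(seq_len, d_k))      # what each token offers
    V = rng.normal(size=(seq_len, d_k))      # what each token passes along

    scores = Q @ K.T / np.sqrt(d_k)          # token-to-token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    output = weights @ V                     # context-aware mix per token
    print(output.shape)                      # (4, 8): one vector per token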