LLM Fundamentals
How language models really work — from tokens and embeddings through neural networks to the Transformer architecture and fine-tuning.
ChatGPT, Claude, Gemini — language models have become an integral part of everyday life. But what exactly are language models, and what happens under the hood? In this series, I explain step by step how language models work — from the fundamental concepts to the complete architecture. Eight articles, each building on the last, plus a bonus chapter where all the building blocks come together into a working mini language model.
1. The Next Word — How Language Models Work
   What happens between input and output? Tokens, probability distributions, and sampling strategies — explained step by step, with real code. (18 Apr 2026)
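As a small taste of that first article, here is a minimal sketch of the core loop: turning raw logits into a probability distribution with softmax and sampling a token from it. The toy vocabulary and logit values are invented for illustration.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Turn raw logits into a probability distribution and sample one token id."""
    rng = rng or np.random.default_rng(0)
    scaled = np.asarray(logits, dtype=float) / temperature
    # softmax: subtract the max for numerical stability
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Toy vocabulary and logits: a higher logit means a more likely next word
vocab = ["cat", "sat", "mat", "hat"]
logits = [2.0, 0.5, 0.1, -1.0]
print(vocab[sample_next_token(logits, temperature=0.8)])
```

Lower temperatures sharpen the distribution toward the top logit; higher ones flatten it, which is exactly the knob the article's sampling-strategy discussion turns.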
2. Words as Points in Space — What Embeddings Are
   How language models encode meaning in numbers. Embedding tables, cosine similarity, vector arithmetic, and why King minus Man plus Woman equals Queen. (18 Apr 2026)
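The famous King − Man + Woman example can be sketched in a few lines. The 3-d vectors below are hand-picked toys (real embeddings have hundreds of learned dimensions), but they show the arithmetic and the cosine-similarity lookup the article explains.

```python
import numpy as np

def cosine(a, b):
    # cosine similarity: 1.0 means "same direction"
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hand-picked toy vectors: dim 0 ~ "royalty", dim 1 ~ "gender", dim 2 ~ noise
emb = {
    "king":  np.array([0.9,  0.8, 0.1]),
    "queen": np.array([0.9, -0.8, 0.1]),
    "man":   np.array([0.1,  0.8, 0.0]),
    "woman": np.array([0.1, -0.8, 0.0]),
}

target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # queen
```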
3. Neural Networks from Scratch
   What happens between embedding and logit. Neurons, layers, forward pass, and activation functions — fully implemented in numpy, no framework magic. (21 Apr 2026)
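In that spirit, a no-framework forward pass fits in a dozen lines of numpy. The layer sizes here are arbitrary stand-ins for illustration: a 4-d "embedding" goes in, 3 "logits" come out.

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    return np.maximum(0.0, x)

# A tiny 2-layer MLP: 4-d input, hidden layer of 8 neurons, 3 outputs
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    h = relu(x @ W1 + b1)   # linear transform + non-linearity
    return h @ W2 + b2      # raw logits, no softmax yet

x = rng.normal(size=4)      # stand-in for a token embedding
logits = forward(x)
print(logits.shape)         # (3,)
```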
4. Backpropagation — How a Model Learns
   How neural networks learn from errors. Loss, gradients, chain rule, gradient descent — backpropagation implemented by hand on a 2-layer MLP that learns XOR and token prediction. (22 Apr 2026)
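The XOR setup from that article can be compressed into one hedged sketch: forward pass, chain-rule backward pass, gradient descent step, all by hand. Hyperparameters (8 hidden units, learning rate 0.5) are illustrative choices, not the article's exact values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(8000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # backward pass: chain rule, layer by layer (BCE + sigmoid gives p - y)
    dout = (p - y) / len(X)
    dW2, db2 = h.T @ dout, dout.sum(0, keepdims=True)
    dh = dout @ W2.T * (1 - h ** 2)          # tanh derivative
    dW1, db1 = X.T @ dh, dh.sum(0, keepdims=True)
    # gradient descent step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print((p > 0.5).astype(int).ravel())  # the four XOR predictions
```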
5. Context and RNNs — Why Order Matters
   Why language needs memory, and how the first language models learned to use that context. Recurrent networks, LSTMs, and the long-sentence problem, explained without a storm of formulas. (24 Apr 2026)
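The "memory" idea reduces to one recurrence: each token updates a hidden state that carries everything seen so far. A minimal sketch, with made-up dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)

# One recurrent cell: the hidden state h is the network's memory
W_xh = rng.normal(scale=0.5, size=(4, 8))   # input  -> hidden
W_hh = rng.normal(scale=0.5, size=(8, 8))   # hidden -> hidden (the recurrence)

def rnn_step(x, h):
    return np.tanh(x @ W_xh + h @ W_hh)

h = np.zeros(8)                      # empty memory before the first token
for x in rng.normal(size=(5, 4)):    # five token embeddings, one at a time
    h = rnn_step(x, h)               # each step folds a new token into h
print(h.shape)  # (8,)
```

That single bottleneck vector `h` is exactly what the long-sentence problem is about, and what attention later dissolves.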
6. Attention Is All You Need
   How the bottleneck of RNNs was replaced by a mechanism that links every token to every other. Query, Key, Value, Multi-Head, and the paper that turned the NLP world upside down, explained with library metaphors and just enough math for the curious. (30 Apr 2026)
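The "every token to every other" mechanism is scaled dot-product attention, sketched here for a single head (the sizes are illustrative, not from the article):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # every token queries every other: scores are all pairwise dot products
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1: "where to look"
    return weights @ V                   # weighted mix of the values

rng = np.random.default_rng(7)
x = rng.normal(size=(5, 16))             # 5 tokens, 16-d each
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (5, 16)
```

Multi-head attention runs several of these in parallel with smaller per-head dimensions and concatenates the results.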
7. The Transformer: The Complete Architecture
   Position, depth, stability — what gets added on top of attention to make a complete transformer. Positional encodings, feed-forward layers, residual connections, layer normalization, the entire block in Python, and the leap to GPT, BERT, Llama, and Claude. (3 May 2026)
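Two of those additions, the feed-forward sublayer and the residual-plus-normalization pattern, fit in a short hedged sketch (dimensions invented; learnable scale and shift parameters of layer norm omitted for brevity):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize each token's features to mean 0, variance 1
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def ffn(x, W1, W2):
    # position-wise feed-forward: expand, non-linearity, project back
    return np.maximum(0.0, x @ W1) @ W2

rng = np.random.default_rng(3)
x = rng.normal(size=(5, 16))                     # 5 tokens coming out of attention
W1, W2 = rng.normal(size=(16, 64)), rng.normal(size=(64, 16))

# the "residual + norm" pattern that keeps deep stacks stable
out = layer_norm(x + ffn(x, W1, W2))
print(out.shape)  # (5, 16)
```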
8. Fine-Tuning: From Base Model to Assistant
   How a base model that completes text becomes a helpful assistant. Supervised Fine-Tuning, RLHF, DPO, and Constitutional AI — the last piece of the LLM pipeline, with the candid question of what alignment actually solves. (5 May 2026)
9. Bonus: Building Lal — A Small Base Model from the Series' Building Blocks
   Eight articles of theory, one bonus chapter of practice. We combine all the code fragments from the LLM series into a working mini language model, train it on TinyShakespeare, and tack on a tiny SFT step. With a wink to Star Trek TNG. (7 May 2026)