LLM Fundamentals
How language models really work — from tokens and embeddings through neural networks to the Transformer architecture and fine-tuning.
ChatGPT, Claude, Gemini — language models have become an integral part of everyday life. But what exactly are language models, and what happens under the hood? In this series, I explain step by step how language models work — from the fundamental concepts to the complete architecture. Eight articles, each building on the last, plus a bonus chapter where all the building blocks come together into a working mini language model.
1. The Next Word — How Language Models Work
   What happens between input and output? Tokens, probability distributions, and sampling strategies — explained step by step, with real code. (18 Apr 2026)
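As a small taste of that first article, here is a minimal sketch of the core loop: turning raw logits into a probability distribution with softmax and sampling a token from it. The toy vocabulary and logit values are invented for illustration.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Turn raw logits into a probability distribution and sample one token id."""
    rng = rng or np.random.default_rng(0)
    scaled = np.asarray(logits, dtype=float) / temperature
    # softmax: subtract the max for numerical stability
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Toy vocabulary and logits: a higher logit means a more likely next word
vocab = ["cat", "sat", "mat", "hat"]
logits = [2.0, 0.5, 0.1, -1.0]
print(vocab[sample_next_token(logits, temperature=0.8)])
```

Lower temperatures sharpen the distribution toward the top logit; higher ones flatten it, which is exactly the knob the article's sampling-strategy discussion turns.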
2. Words as Points in Space — What Embeddings Are
   How language models encode meaning in numbers. Embedding tables, cosine similarity, vector arithmetic, and why King minus Man plus Woman equals Queen. (18 Apr 2026)
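The famous King − Man + Woman example can be sketched in a few lines. The 3-d vectors below are hand-picked toys (real embeddings have hundreds of learned dimensions), but they show the arithmetic and the cosine-similarity lookup the article explains.

```python
import numpy as np

def cosine(a, b):
    # cosine similarity: 1.0 means "same direction"
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hand-picked toy vectors: dim 0 ~ "royalty", dim 1 ~ "gender", dim 2 ~ noise
emb = {
    "king":  np.array([0.9,  0.8, 0.1]),
    "queen": np.array([0.9, -0.8, 0.1]),
    "man":   np.array([0.1,  0.8, 0.0]),
    "woman": np.array([0.1, -0.8, 0.0]),
}

target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # queen
```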
3. Neural Networks from Scratch
   What happens between embedding and logit. Neurons, layers, forward pass, and activation functions — fully implemented in numpy, no framework magic. (21 Apr 2026)
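In that spirit, a no-framework forward pass fits in a dozen lines of numpy. The layer sizes here are arbitrary stand-ins for illustration: a 4-d "embedding" goes in, 3 "logits" come out.

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    return np.maximum(0.0, x)

# A tiny 2-layer MLP: 4-d input, hidden layer of 8 neurons, 3 outputs
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    h = relu(x @ W1 + b1)   # linear transform + non-linearity
    return h @ W2 + b2      # raw logits, no softmax yet

x = rng.normal(size=4)      # stand-in for a token embedding
logits = forward(x)
print(logits.shape)         # (3,)
```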
4. Backpropagation — How a Model Learns
   How neural networks learn from errors. Loss, gradients, chain rule, gradient descent — backpropagation implemented by hand on a 2-layer MLP that learns XOR and token prediction. (22 Apr 2026)
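The XOR setup from that article can be compressed into one hedged sketch: forward pass, chain-rule backward pass, gradient descent step, all by hand. Hyperparameters (8 hidden units, learning rate 0.5) are illustrative choices, not the article's exact values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(8000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # backward pass: chain rule, layer by layer (BCE + sigmoid gives p - y)
    dout = (p - y) / len(X)
    dW2, db2 = h.T @ dout, dout.sum(0, keepdims=True)
    dh = dout @ W2.T * (1 - h ** 2)          # tanh derivative
    dW1, db1 = X.T @ dh, dh.sum(0, keepdims=True)
    # gradient descent step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print((p > 0.5).astype(int).ravel())  # the four XOR predictions
```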
5. Context and RNNs — Why Order Matters
   Why language needs memory, and how the first language models learned to use that context. Recurrent networks, LSTMs, and the long-sentence problem, explained without a storm of formulas. (24 Apr 2026)
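The "memory" idea reduces to one recurrence: each token updates a hidden state that carries everything seen so far. A minimal sketch, with made-up dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)

# One recurrent cell: the hidden state h is the network's memory
W_xh = rng.normal(scale=0.5, size=(4, 8))   # input  -> hidden
W_hh = rng.normal(scale=0.5, size=(8, 8))   # hidden -> hidden (the recurrence)

def rnn_step(x, h):
    return np.tanh(x @ W_xh + h @ W_hh)

h = np.zeros(8)                      # empty memory before the first token
for x in rng.normal(size=(5, 4)):    # five token embeddings, one at a time
    h = rnn_step(x, h)               # each step folds a new token into h
print(h.shape)  # (8,)
```

That single bottleneck vector `h` is exactly what the long-sentence problem is about, and what attention later dissolves.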
6. Attention Is All You Need
   How the bottleneck of RNNs was replaced by a mechanism that links every token to every other. Query, Key, Value, Multi-Head, and the paper that turned the NLP world upside down, explained with library metaphors and just enough math for the curious. (30 Apr 2026)
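The "every token to every other" mechanism is scaled dot-product attention, sketched here for a single head (the sizes are illustrative, not from the article):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # every token queries every other: scores are all pairwise dot products
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1: "where to look"
    return weights @ V                   # weighted mix of the values

rng = np.random.default_rng(7)
x = rng.normal(size=(5, 16))             # 5 tokens, 16-d each
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (5, 16)
```

Multi-head attention runs several of these in parallel with smaller per-head dimensions and concatenates the results.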
7. The Transformer: The Complete Architecture
   Position, depth, stability — what gets added on top of attention to make a complete transformer. Positional encodings, feed-forward layers, residual connections, layer normalization, the entire block in Python, and the leap to GPT, BERT, Llama, and Claude. (3 May 2026)
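Two of those additions, the feed-forward sublayer and the residual-plus-normalization pattern, fit in a short hedged sketch (dimensions invented; learnable scale and shift parameters of layer norm omitted for brevity):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize each token's features to mean 0, variance 1
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def ffn(x, W1, W2):
    # position-wise feed-forward: expand, non-linearity, project back
    return np.maximum(0.0, x @ W1) @ W2

rng = np.random.default_rng(3)
x = rng.normal(size=(5, 16))                     # 5 tokens coming out of attention
W1, W2 = rng.normal(size=(16, 64)), rng.normal(size=(64, 16))

# the "residual + norm" pattern that keeps deep stacks stable
out = layer_norm(x + ffn(x, W1, W2))
print(out.shape)  # (5, 16)
```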
8. Fine-Tuning: From Base Model to Assistant
   How a base model that completes text becomes a helpful assistant. Supervised Fine-Tuning, RLHF, DPO, and Constitutional AI — the last piece of the LLM pipeline, with the candid question of what alignment actually solves. (5 May 2026)
9. Bonus: Building Lal — A Small Base Model from the Series' Building Blocks
   Eight articles of theory, one bonus chapter of practice. We combine all the code fragments from the LLM series into a working mini language model, train it on TinyShakespeare, and tack on a tiny SFT step. With a wink to Star Trek TNG. (7 May 2026)