Training is how a model learns from data. It repeatedly predicts the next token, compares the prediction to the correct answer (the loss), and updates its internal numbers (parameters) so the predictions improve.
Training loop (simplified)
1. Data: text (or pairs) the model learns from
2. Tokenize: turn text into token IDs
3. Forward: the model predicts the next token
4. Loss: compare the prediction to the correct answer
5. Backward: compute gradients
6. Update: adjust billions of parameters
Repeat over huge datasets for many steps. "Parameters" are the numbers being updated; more parameters means more capacity to learn patterns.
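The loop above can be sketched with a toy model. This is an illustrative sketch only: instead of billions of parameters it uses a tiny bigram table (one logit per pair of tokens), and the "tokenizer" is just a word-to-ID map, but the tokenize / forward / loss / backward / update structure is the same.

```python
import math

# Toy corpus and "tokenizer": map each word to an integer ID.
corpus = "the cat sat on the mat the cat ran".split()
vocab = sorted(set(corpus))
stoi = {w: i for i, w in enumerate(vocab)}
ids = [stoi[w] for w in corpus]
V = len(vocab)

# Parameters: a V x V table of logits, W[prev][next], initialized to zero.
W = [[0.0] * V for _ in range(V)]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

lr = 0.5
losses = []
for step in range(100):
    total = 0.0
    for prev, nxt in zip(ids, ids[1:]):
        # Forward: predict a probability for every possible next token.
        probs = softmax(W[prev])
        # Loss: negative log-probability of the correct next token.
        total += -math.log(probs[nxt])
        # Backward: gradient of the loss w.r.t. the logits
        # (for softmax + cross-entropy this is probs, minus 1 at the target).
        grad = probs[:]
        grad[nxt] -= 1.0
        # Update: nudge the parameters against the gradient.
        for j in range(V):
            W[prev][j] -= lr * grad[j]
    losses.append(total / (len(ids) - 1))

print(f"loss at step 0: {losses[0]:.3f}, at step 99: {losses[-1]:.3f}")
```

The loss falls over the 100 steps as the table learns which word tends to follow which. Real training is this same loop, scaled up: a neural network instead of a table, an automatic-differentiation library for the backward pass, and vastly more data and steps.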
In practice
Training from scratch needs huge datasets, lots of compute (GPUs), and many steps, so you don't usually do it yourself; instead, you fine-tune an existing model on your data.
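Fine-tuning is the same loop, just started from existing parameters instead of from scratch, and usually run with a smaller learning rate on a smaller dataset. A minimal sketch, reusing the toy bigram setup from above (the "pretrained" weights here are random stand-ins, not a real model, and the token IDs are made up):

```python
import math
import random

random.seed(0)
V = 5  # toy vocabulary size (assumption for illustration)

# Stand-in for pretrained parameters (a real model would load a checkpoint).
W = [[random.gauss(0, 1) for _ in range(V)] for _ in range(V)]

# Token IDs from *your* (small) dataset.
new_ids = [0, 1, 2, 1, 2, 3, 0, 1]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def avg_loss(W, ids):
    total = sum(-math.log(softmax(W[p])[n]) for p, n in zip(ids, ids[1:]))
    return total / (len(ids) - 1)

before = avg_loss(W, new_ids)

lr = 0.1  # smaller step than pretraining, to adapt without forgetting
for _ in range(50):
    for p, n in zip(new_ids, new_ids[1:]):
        probs = softmax(W[p])
        grad = probs[:]
        grad[n] -= 1.0
        for j in range(V):
            W[p][j] -= lr * grad[j]

after = avg_loss(W, new_ids)
print(f"loss on your data before: {before:.3f}, after: {after:.3f}")
```

The point is that nothing structurally new happens: fine-tuning continues the update loop from where pretraining left off, so the model keeps its general knowledge while adapting to the new data.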