Training
How a model goes from knowing nothing to talking about everything.
The analogy
Think of training a doctor: first, years of reading the whole library (pre-training); then supervised practice on cases corrected by tutors (supervised fine-tuning, also called instruction tuning); and finally feedback from real patients that polishes their manner (refinement from human preferences).
In detail
Typical training runs in three phases: self-supervised pre-training on trillions of tokens (predicting the next token), supervised fine-tuning (SFT) on high-quality examples, and reinforcement learning from human feedback (RLHF) to align behavior. Training a large model can cost millions of dollars in compute; using it (inference) typically costs cents or less per request.
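To make "predicting the next token" concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not how a real model is trained: a bigram counting table stands in for the neural network, the two-sentence corpus is invented, and the function names (next_token_pairs, fit_bigram, average_loss) are hypothetical. What it does share with real pre-training is the self-supervised setup, where the text supplies its own labels, and the loss, which is the average negative log-probability of the true next token.

```python
import math
from collections import Counter, defaultdict


def next_token_pairs(tokens):
    """Turn a token sequence into (previous, next) pairs.

    No human labels are needed: each token is the target for the one before it.
    """
    return [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]


def fit_bigram(tokens):
    """Estimate P(next | previous) by counting (a stand-in for a neural network)."""
    counts = defaultdict(Counter)
    for prev, nxt in next_token_pairs(tokens):
        counts[prev][nxt] += 1
    return {
        prev: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
        for prev, ctr in counts.items()
    }


def average_loss(model, tokens):
    """Average negative log-probability assigned to the true next token."""
    pairs = next_token_pairs(tokens)
    return sum(-math.log(model[prev][nxt]) for prev, nxt in pairs) / len(pairs)


# Toy corpus, invented for illustration; real pre-training uses trillions of tokens.
corpus = "paris is the capital of france . rome is the capital of italy .".split()

model = fit_bigram(corpus)
loss = average_loss(model, corpus)

print(model["of"])  # the only uncertain step: "france" or "italy"
print(loss)
```

In this toy corpus every token except "of" has a single possible successor, so the only contribution to the loss comes from the two places where the table has to split its probability between "france" and "italy". A real model lowers the same kind of loss by adjusting billions of parameters rather than by counting.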
An example
The same base model that completes “Paris is the capital of…” learns, after tuning, to answer politely, refuse harmful requests and stick to the format you ask for.
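One way to picture what changes during tuning is to look at how a supervised fine-tuning example might be laid out. The sketch below is an assumption-heavy illustration: the "User:" / "Assistant:" template, the "<eos>" marker, and the helper build_sft_example are hypothetical, and scoring only the response tokens is a common choice rather than a universal rule. The objective is still next-token prediction; what differs is the data (curated prompt and response pairs) and which positions count toward the loss.

```python
def build_sft_example(prompt_tokens, response_tokens):
    """Build inputs, targets, and a loss mask for one prompt/response pair.

    Position i of `inputs` is used to predict `targets[i]`. The mask is 1 only
    where the target is a response token, so the model is trained on how to
    answer, not on reproducing the prompt.
    """
    tokens = prompt_tokens + response_tokens
    inputs = tokens[:-1]
    targets = tokens[1:]
    # Index in `targets` of the first response token.
    first_response_target = len(prompt_tokens) - 1
    mask = [1 if i >= first_response_target else 0 for i in range(len(targets))]
    return inputs, targets, mask


# Hypothetical chat template and tokens, for illustration only.
prompt = ["User:", "capital", "of", "France", "?", "Assistant:"]
response = ["Paris", ".", "<eos>"]

inputs, targets, mask = build_sft_example(prompt, response)
scored = [tok for tok, m in zip(targets, mask) if m == 1]

print(mask)
print(scored)
```

Here only "Paris", "." and the end marker contribute to the loss, which is how curated examples can teach tone, refusals, and formatting without changing the underlying prediction task.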