Tokenization: How Models Read Text
Models don’t read sentences; they read tokens: subwords like "in", "gen", "er", "ation", "AI", "##s".
Tokenization affects:
- context window
- reasoning quality
- output stability
- cost of inference
Bad tokenization → broken prompts, hallucinations, cutoff words.
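To make this concrete, here is a minimal sketch using the tiktoken library (a GPT-style BPE tokenizer). The exact splits vary by tokenizer; the "##" prefix shown above is WordPiece notation used by BERT-style tokenizers, not BPE.

```python
import tiktoken

# GPT-style BPE tokenizer; other tokenizers (e.g. BERT's WordPiece, which
# marks continuation pieces like "##s") split the same text differently.
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokenization affects the context window."
ids = enc.encode(text)

# The subword pieces the model actually sees:
print([enc.decode([i]) for i in ids])

# Token count, not character count, is what fills the context window and drives cost:
print(f"{len(text)} characters -> {len(ids)} tokens")
```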
How Generation Works (Mechanically)
Every time the model generates a token, it performs the same loop:
1. Read the input tokens
2. Compute attention weights
3. Output a probability distribution over the vocabulary
4. Sample one token
5. Append it and repeat
This loop happens hundreds to thousands of times per prompt.
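Here is a minimal sketch of that loop, assuming a hypothetical `model()` function that maps the current token sequence to next-token logits (random numbers here, not a real network):

```python
import numpy as np

VOCAB_SIZE = 50
EOS_ID = 0  # hypothetical end-of-sequence token id

def model(tokens):
    # Stand-in for steps 1-3: a real model would read the tokens, compute
    # attention, and return logits; here we just return random scores.
    rng = np.random.default_rng(seed=len(tokens))
    return rng.normal(size=VOCAB_SIZE)

def generate(prompt_tokens, max_new_tokens=20):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model(tokens)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                                   # softmax -> probability distribution
        next_id = int(np.random.choice(VOCAB_SIZE, p=probs))   # step 4: sample one token
        tokens.append(next_id)                                 # step 5: append it and repeat
        if next_id == EOS_ID:
            break
    return tokens

print(generate([11, 42, 7]))
```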
Sampling Techniques
These control the trade-off between creativity and stability (a sketch of the first three follows the list):
- Temperature
  - Higher → more random
  - Lower → more deterministic
- Top-K
  - Keep only the k most probable tokens
- Top-P (nucleus sampling)
  - Keep tokens until the cumulative probability mass reaches P
- Beam Search
  - Explore multiple candidate sequences in parallel and keep the most probable ones
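A sketch of how temperature, top-k, and top-p reshape the distribution before a token is drawn (toy logits; beam search is a search strategy over whole sequences rather than a per-token filter, so it is omitted here):

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=None, top_p=None):
    # Temperature rescales the logits: lower -> sharper (more deterministic),
    # higher -> flatter (more random).
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-K: zero out everything except the k most probable tokens.
    if top_k is not None:
        cutoff = np.sort(probs)[-min(top_k, len(probs))]
        probs = np.where(probs >= cutoff, probs, 0.0)

    # Top-P (nucleus): keep tokens, most probable first, until their
    # cumulative probability mass reaches P (always at least one token).
    if top_p is not None:
        order = np.argsort(probs)[::-1]
        csum = np.cumsum(probs[order])
        n_keep = int(np.searchsorted(csum, top_p) + 1)
        mask = np.zeros_like(probs)
        keep = order[:n_keep]
        mask[keep] = probs[keep]
        probs = mask

    probs /= probs.sum()                           # renormalize after filtering
    return int(np.random.choice(len(probs), p=probs))

# Toy 5-token vocabulary with made-up logits.
print(sample_token([2.0, 1.0, 0.5, 0.1, -1.0], temperature=0.7, top_k=3, top_p=0.9))
```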
If you don’t understand sampling, you won’t understand model behavior.
Why Models Hallucinate
Because the model must always predict a next token (see the toy demo after this list), even when:
- context is missing
- training data is sparse
- retrieval is bad
- the prompt is unclear
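A toy illustration of that mechanical point: even when the distribution is nearly flat (the model has essentially no evidence), sampling still emits one token, which then reads like a confident answer. The words and probabilities below are made up.

```python
import numpy as np

# Made-up, near-uniform distribution over candidate answers: the model
# "doesn't know", but the decoding loop cannot abstain.
vocab = ["Paris", "London", "Berlin", "Madrid"]
probs = np.array([0.26, 0.25, 0.25, 0.24])

print(np.random.choice(vocab, p=probs))  # prints a confident-looking answer anyway
```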