🌱 Beginner - AI Fundamentals

Chapter 5 of 24

📖 Chapter 5: The Journey From Text to AI

From raw text to vectors step by step

Let's start simple. Input text:

"The cat sat on the mat"
1. Tokenization (Breaking Into Pieces)

The | cat | sat | on | the | mat

AI does NOT see text like we do. It sees chunks called tokens, which can be whole words, parts of words, or punctuation marks.
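To make the idea of "chunks" concrete, here is a minimal sketch of a tokenizer. It is an assumption for illustration only: `toy_tokenize` is a made-up helper that splits on words and punctuation, whereas real models typically use trained subword tokenizers that can also split a single word into several pieces.

```python
import re

def toy_tokenize(text):
    """Split text into word and punctuation chunks.

    Hypothetical stand-in for a real subword tokenizer.
    """
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("The cat sat on the mat")
print(tokens)  # ['The', 'cat', 'sat', 'on', 'the', 'mat']
```

Notice that the output is a list of separate pieces, not one string. Everything that follows works on this list.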

2. Convert Tokens to IDs

IDs

464 | 2828 | 3332 | 319 | 262 | 2603

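These numbers are just labels: each ID is a token's entry number in the model's vocabulary (e.g. cat→5, dog→12). The numbers themselves mean nothing, so a bigger ID does not make a word bigger or more important.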
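The labeling step can be sketched as a simple dictionary lookup. This is a toy under stated assumptions: `build_vocab` is a hypothetical helper that numbers tokens in order of first appearance, so its IDs will not match the IDs shown above, which come from a real, much larger vocabulary.

```python
def build_vocab(tokens):
    """Give each distinct token the next unused integer, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab

tokens = ["The", "cat", "sat", "on", "the", "mat"]
vocab = build_vocab(tokens)

# Look up each token's label. "The" and "the" get different IDs
# because the lookup is case-sensitive.
ids = [vocab[token] for token in tokens]
print(ids)  # [0, 1, 2, 3, 4, 5]
```

Swapping the numbering order would change every ID without changing anything about the words, which is exactly why the numbers carry no meaning on their own.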

3. Convert IDs into Embeddings

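Each token ID is looked up in an embedding table and becomes a vector like [0.234, -0.891, 0.445, ..., 0.672]. Vectors of 768 to 1536 dimensions are common, and larger models use more. The values are learned during training, so the token now has mathematical meaning.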
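An embedding lookup is, at its core, just "use the ID as a row number in a table". The sketch below assumes a tiny table filled with random numbers; `EMBEDDING_DIM`, `VOCAB_SIZE`, and `embed` are illustrative names, and in a real model the table values are learned during training rather than drawn at random.

```python
import random

EMBEDDING_DIM = 8  # toy size; real models use hundreds or thousands of dimensions
VOCAB_SIZE = 6     # one row per token in our toy vocabulary

rng = random.Random(0)  # fixed seed so the table is the same on every run

# One row of EMBEDDING_DIM numbers per token ID.
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in range(VOCAB_SIZE)
]

def embed(token_ids):
    """Return the table row for each token ID."""
    return [embedding_table[token_id] for token_id in token_ids]

vectors = embed([0, 1, 2, 3, 4, 5])
print(len(vectors), len(vectors[0]))  # 6 8
```

Six token IDs go in and six vectors of eight numbers come out. The same ID always returns the same row.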

4. The Model Understands Meaning

The model doesn't understand words. It understands vector geometry: tokens with related meanings end up with vectors that point in similar directions.
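One common way to measure "vector geometry" is cosine similarity, which compares the directions of two vectors. The sketch below uses hand-picked 3-dimensional vectors for cat, dog, and car. They are made up for illustration, not taken from any real model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, chosen so that cat and dog point the same way.
cat = [0.9, 0.8, 0.1]
dog = [0.8, 0.9, 0.2]
car = [0.1, 0.2, 0.9]

print(cosine_similarity(cat, dog) > cosine_similarity(cat, car))  # True
```

Here "cat" is closer to "dog" than to "car" purely because of where the numbers sit in space. No dictionary definitions are involved.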