
🌱 Beginner – AI Fundamentals

Chapter 11 of 24

πŸ“ Chapter 11: What Is an Embedding?

Meaningful vectors from token IDs

Token IDs are just labels (e.g. 2828 for "cat") with no mathematical meaning. An embedding is a vector of numbers (e.g. 768 or 1536 dimensions) that the model assigns to each token (or to a sequence). The key property: similar meaning → vectors that are close in space (small distance or large cosine similarity). So "cat" and "kitten" get similar vectors; "cat" and "pizza" get very different ones. The model doesn't read words; it uses this geometry for similarity search, RAG, and internal reasoning.
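The "large cosine similarity" idea can be made concrete with a few lines of code. The vectors below are tiny made-up 3-dimensional "embeddings" chosen for illustration; real embeddings have hundreds of dimensions, but the formula is the same:

```python
import math

def cosine_similarity(a, b):
    # cos(angle) = (a . b) / (|a| * |b|); near 1.0 = pointing the same way
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented toy vectors (real models would produce these)
cat    = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.15]
pizza  = [0.1, 0.2, 0.95]

print(cosine_similarity(cat, kitten))  # close to 1.0
print(cosine_similarity(cat, pizza))   # much smaller
```

With these numbers, cat/kitten score near 1.0 while cat/pizza scores far lower, which is exactly the geometry the chapter describes.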

The pipeline: from raw text to graph

Before the model can use meaning, text is cleaned (normalized), tokenized, turned into IDs, then into vectors via an embedding layer or model. Those vectors live in high-dimensional space; we visualize 2D slices or 3D projections (e.g. via PCA) so humans can see clusters.

From text to embedding space

1. Raw text ("THE CAFE!!") →
2. Clean ("the cafe") →
3. Tokenize (the | cafe) →
4. Embed ([0.2, -0.8, …]) →
5. Graph (a point in space)
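The five steps above can be sketched end to end. Everything here is a stand-in: the vocabulary, the token IDs, and the embedding table are invented, and the whitespace tokenizer replaces the subword tokenizers (BPE etc.) real models use:

```python
# Toy vocabulary and 2-d embedding table (values invented for illustration)
vocab = {"the": 0, "cafe": 1}          # token -> ID
embedding_table = [
    [0.2, -0.8],   # row 0: vector for "the"
    [0.5,  0.3],   # row 1: vector for "cafe"
]

def clean(text):
    # Step 2: normalize -- lowercase, strip punctuation
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()

def tokenize(text):
    # Step 3: split into tokens (real tokenizers use subwords)
    return text.split()

text = "THE CAFE!!"
tokens = tokenize(clean(text))               # ["the", "cafe"]
ids = [vocab[t] for t in tokens]             # [0, 1]
vectors = [embedding_table[i] for i in ids]  # one vector per token
print(vectors)
```

Each vector is one point in the space pictured below; the "graph" step is just plotting those points.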

2D embedding space (similar = close). Axes = two of many dimensions.

[2D scatter plot — x axis: dimension 1, y axis: dimension 2; points for cat, kitten, dog, pizza, food, run, walk]

cat/kitten/dog cluster together; pizza/food separate; run/walk nearby. Real embeddings use hundreds of dimensions (we show 2).

Real embedding space has many axes (e.g. 768 or 1536). Here: x, y, z as a concept.

[3D axes sketch — x, y, z from the origin; each word is one point]

Each word = one point (x, y, z, …). Similar words sit close in this space.
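To see how a 2D picture comes out of a many-dimensional space, the crudest projection is to keep two coordinates and discard the rest (PCA, mentioned earlier, picks better axes, but the idea is the same). The 5-dimensional vectors below are invented for illustration:

```python
# Invented 5-d "embeddings" (real ones have hundreds of dimensions)
words = {
    "cat":    [0.9, 0.8, 0.1, 0.3, 0.2],
    "kitten": [0.85, 0.75, 0.2, 0.25, 0.3],
    "pizza":  [0.1, 0.2, 0.9, 0.8, 0.1],
}

def slice_2d(vec, dims=(0, 1)):
    # Keep two chosen dimensions -- a crude stand-in for PCA projection
    return (vec[dims[0]], vec[dims[1]])

points = {w: slice_2d(v) for w, v in words.items()}
print(points)  # cat and kitten land near each other; pizza lands far away
```

Any such projection loses information; two points that look close in 2D may be far apart along the discarded dimensions, which is why the plots in this chapter are only illustrations.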

Meaning = geometry

In vector space: "cat" is close to "kitten" and "dog", and far from "pizza". Distance or angle between vectors is used for retrieval (find the closest docs to a query) and for the model’s own "understanding" of token relationships.
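"Find the closest" is just a minimum-distance search. A minimal sketch over the same toy vectors used earlier (values invented; Euclidean distance here, though cosine similarity works equally well):

```python
import math

# Invented toy embeddings for four words
embeddings = {
    "cat":    [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.15],
    "dog":    [0.8, 0.7, 0.2],
    "pizza":  [0.1, 0.2, 0.95],
}

def distance(a, b):
    # Euclidean distance between two vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest(word):
    # Nearest neighbour = the other word with the smallest distance
    others = {w: v for w, v in embeddings.items() if w != word}
    return min(others, key=lambda w: distance(embeddings[word], others[w]))

print(closest("cat"))  # "kitten" with these toy values
```

Real systems run the same search over millions of vectors, using approximate nearest-neighbour indexes instead of a linear scan.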

3D embedding space (rotating)


Same words (cat, kitten, dog, pizza, …) as 3D points. Similar meaning = close in space.

Example: How embeddings are used

RAG: Your docs are chunked and each chunk is embedded. When the user asks a question, the question is embedded and you search for the closest chunks. Those chunks are passed to the LLM as context.

Semantic search: Same idea — query and items are embedded; you return the top-k by similarity.

Clustering: Group similar items by embedding distance (e.g. support tickets by topic).
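The retrieval step shared by RAG and semantic search boils down to "rank chunks by similarity to the query, keep the top k". A minimal sketch with invented 3-d embeddings; a real system would call an embedding model for both the chunks and the query:

```python
import math

def cosine(a, b):
    # Cosine similarity: higher = more semantically similar
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy chunks with invented embedding vectors
chunks = [
    ("Cats are small domesticated felines.",  [0.9, 0.8, 0.1]),
    ("Kittens are young cats.",               [0.85, 0.7, 0.2]),
    ("Pizza is baked with cheese and dough.", [0.1, 0.2, 0.9]),
]

def top_k(query_vec, k=2):
    # Rank all chunks by similarity to the query; keep the best k
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

query_vec = [0.88, 0.75, 0.12]  # pretend embedding of "tell me about cats"
context = top_k(query_vec)
print(context)  # the two cat-related chunks, ready to pass to the LLM
```

In a real RAG pipeline the linear `sorted` scan is replaced by a vector index, but the ranking logic is the same.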

Try the Embeddings & Similarity simulator for interactive closest-word search and top-k.