
Understanding Embeddings

An embedding is a representation of a piece of text as a vector of numbers. Embeddings are the foundation of modern RAG systems.

This section explains:

  1. What embeddings are (intuition + what they look like)
  2. How embedding models work (from Word2Vec to modern transformers)
  3. Why embeddings capture meaning (the geometry of language)
  4. How to choose an embedding model (trade-offs and comparisons)

Key Insight

An embedding is just a vector of numbers that represents the meaning of a word or document. Vectors that are geometrically close to each other represent similar meanings.
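
To make this concrete, here is a minimal sketch (assuming the sentence-transformers library and the all-MiniLM-L6-v2 model, which are illustrative choices rather than requirements) showing what an embedding looks like and how geometric closeness is typically measured:

```python
from sentence_transformers import SentenceTransformer
import numpy as np

# Load a small, widely used embedding model (assumption: sentence-transformers is installed)
model = SentenceTransformer("all-MiniLM-L6-v2")

# Each text becomes a fixed-length vector of floats (384 dimensions for this model)
vec_a = model.encode("How do I reset my password?")
vec_b = model.encode("I forgot my login credentials")
vec_c = model.encode("Best pizza toppings")

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: near 1.0 means very similar direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(vec_a.shape)                      # (384,) -- just a vector of numbers
print(cosine_similarity(vec_a, vec_b))  # comparatively high: similar meaning
print(cosine_similarity(vec_a, vec_c))  # comparatively low: unrelated meaning
```

Cosine similarity is the usual closeness measure here: vectors pointing in similar directions score near 1, and the lower the score, the less related the texts.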

Why Embeddings Matter for RAG

  1. Semantic search: Find documents with similar meaning, not just matching keywords (see the sketch after this list)
  2. Speed: Comparing vectors is fast; comparing full texts is slow
  3. Relevance: Embeddings capture nuanced meaning that keyword search misses
  4. Cross-lingual: The same meaning expressed in different languages can end up close in embedding space
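
To make points 1 and 2 concrete, here is a minimal semantic-search sketch (again assuming the illustrative model above). The corpus is embedded once up front; each query then reduces to a single matrix-vector product, which is why vector comparison is fast:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Our refund policy allows returns within 30 days.",
    "Reset your password from the account settings page.",
    "We ship internationally to over 50 countries.",
]

# Embed the corpus once, normalized so a dot product equals cosine similarity
doc_vectors = model.encode(documents, normalize_embeddings=True)

def search(query: str, top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k documents whose embeddings are closest to the query embedding."""
    query_vector = model.encode(query, normalize_embeddings=True)
    scores = doc_vectors @ query_vector          # one matrix-vector product scores every document
    best = np.argsort(scores)[::-1][:top_k]
    return [(float(scores[i]), documents[i]) for i in best]

# "money back" shares no keywords with the refund document, but their meanings are close
print(search("Can I get my money back?"))
```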

But Remember...

Embeddings have clear strengths and limitations:

  - ✅ Excellent for capturing meaning
  - ❌ Terrible for exact identifiers (Order #1766 ≈ Order #1767 in embedding space)
  - ❌ Limited by the context window of the embedding model
  - ❌ Biases in the training data are captured in the embeddings

This is exactly why RAG systems need hybrid search (semantic + keyword). We'll revisit this in The Exact Match Problem.
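As a rough illustration of that hybrid idea (a toy sketch with made-up semantic scores, not the approach used later in this guide), one simple pattern blends a keyword-overlap signal with the semantic score so that exact identifiers still win:

```python
def hybrid_score(query: str, document: str, semantic_score: float,
                 keyword_weight: float = 0.5) -> float:
    """Blend semantic similarity with a crude keyword-overlap signal.

    Toy example: real systems typically use BM25 on the keyword side and a tuned
    fusion method (e.g. reciprocal rank fusion) instead of a fixed weight.
    """
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    keyword_score = len(query_terms & doc_terms) / max(len(query_terms), 1)
    return (1 - keyword_weight) * semantic_score + keyword_weight * keyword_score

# The semantic scores below are illustrative placeholders. An exact identifier like
# "#1766" adds a strong keyword signal even when its embedding is nearly
# indistinguishable from "#1767".
print(hybrid_score("order #1766 status", "Shipping update for order #1766", semantic_score=0.82))
print(hybrid_score("order #1766 status", "Shipping update for order #1767", semantic_score=0.81))
```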

Reading Path

I recommend reading in order:

  1. Start here (done!)
  2. What are Embeddings
  3. Embedding Models
  4. Vector Spaces

Then move to Similarity Search.


Key Takeaway: Embeddings are powerful but imperfect. They excel at capturing semantic meaning but struggle with exact identifiers and other structured data. Understanding their strengths and weaknesses is crucial for building good RAG systems.