Understanding Embeddings
An embedding represents a piece of text as a vector of numbers. This representation is the foundation of modern RAG systems.
This section explains:
- What embeddings are (intuition + what they look like)
- How embedding models work (from Word2Vec to modern transformers)
- Why embeddings capture meaning (the geometry of language)
- How to choose an embedding model (trade-offs and comparisons)
Key Insight
An embedding is just a vector of numbers that represents the meaning of a word or document. Vectors that are geometrically close to each other represent similar meanings.
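To make this concrete, here is a minimal sketch of turning text into vectors and measuring how close they are. It assumes the sentence-transformers package (covered later in Embedding Models); the all-MiniLM-L6-v2 model and the example sentences are illustrative choices, not requirements.

```python
# Minimal sketch: text in, vectors out, similarity as geometry.
# Assumes the sentence-transformers package; all-MiniLM-L6-v2 is just one
# small, freely available model (it produces 384-dimensional vectors).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Each sentence becomes a fixed-length vector of floats.
vectors = model.encode([
    "How do I reset my password?",
    "Steps to recover account access",
    "Best pizza toppings for a party",
])

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1.0 = more similar."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors[0], vectors[1]))  # related sentences: higher score
print(cosine_similarity(vectors[0], vectors[2]))  # unrelated sentence: lower score
```

The two account-related sentences score noticeably higher with each other than either does with the unrelated one; that geometric closeness is what "similar meaning" looks like in practice.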
Topics Covered
- What are Embeddings — Concrete examples and intuition
- Embedding Models — Word2Vec, GloVe, BERT, Sentence Transformers
- Vector Spaces — High-dimensional geometry and implications
Why Embeddings Matter for RAG
- Semantic search: Find documents with similar meaning, not just matching keywords (see the sketch after this list)
- Speed: Comparing vectors is fast; comparing full texts is slow
- Relevance: Embeddings capture nuanced meaning that keyword search misses
- Cross-lingual: Same meaning, different languages, can be close in embedding space
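Here is a rough sketch of what semantic search looks like in code: embed the documents once, embed the query, and rank by cosine similarity. It reuses the `model` object from the snippet above; the documents and query are made up for illustration.

```python
# Semantic search sketch: rank documents by cosine similarity to the query.
# Reuses `model` from the previous snippet; documents and query are illustrative.
import numpy as np

documents = [
    "Our refund policy allows returns within 30 days.",
    "The API rate limit is 100 requests per minute.",
    "Contact support to get your money back after a purchase.",
]
doc_vectors = model.encode(documents)
query_vector = model.encode("How do I return an item for a refund?")

# Normalize once so a plain dot product equals cosine similarity.
doc_norm = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
query_norm = query_vector / np.linalg.norm(query_vector)

scores = doc_norm @ query_norm
for idx in np.argsort(scores)[::-1]:
    print(f"{scores[idx]:.3f}  {documents[idx]}")
```

Notice that the refund-related documents share almost no keywords with the query ("money back" vs. "refund"), which is exactly the kind of match keyword search misses.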
But Remember...
Embeddings have clear strengths and clear limitations:
- ✅ Excellent for capturing meaning
- ❌ Terrible for exact identifiers (Order #1766 ≈ Order #1767 in embedding space)
- ❌ Limited by the context window of the embedding model
- ❌ Biases in the training data are captured in the embeddings
This is exactly why RAG systems need hybrid search (semantic + keyword). We'll revisit this in The Exact Match Problem.
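As a rough illustration of why the keyword side matters, here is a toy hybrid scorer that blends cosine similarity with exact token overlap. The `keyword_score` and `hybrid_score` functions, the `alpha` weight, and the whitespace tokenization are simplifications for illustration; production systems typically pair BM25 with a vector index.

```python
# Toy hybrid scorer: blend semantic similarity with exact keyword overlap.
# The functions, weighting, and tokenization here are simplifications;
# production systems usually combine BM25 with a vector index.
import numpy as np

def keyword_score(query: str, doc: str) -> float:
    """Fraction of query tokens that appear verbatim in the document."""
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0

def hybrid_score(query: str, doc: str, query_vec, doc_vec, alpha: float = 0.5) -> float:
    """Weighted blend of cosine similarity and keyword overlap."""
    semantic = float(np.dot(query_vec, doc_vec) /
                     (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)))
    return alpha * semantic + (1 - alpha) * keyword_score(query, doc)

# The keyword component is what tells "Order #1766" apart from "Order #1767",
# even when their embeddings are nearly identical.
query = "status of order #1766"
print(keyword_score(query, "order #1766 shipped yesterday"))  # 0.5
print(keyword_score(query, "order #1767 shipped yesterday"))  # 0.25
```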
Reading Path
I recommend reading in order:
- Start here (done!)
- What are Embeddings
- Embedding Models
- Vector Spaces
Then move to Similarity Search.
Key Takeaway: Embeddings are powerful but imperfect. They excel at capturing semantic meaning but struggle with exact identifiers and other structured data. Understanding their strengths and weaknesses is crucial for building good RAG systems.