Lab 6: Hybrid Search – Combining Semantic & Keyword Matching
Level: Intermediate-Advanced | Duration: 2.5 hours
Objective
Implement production-ready hybrid search combining dense (semantic) and sparse (keyword) retrieval.
What You'll Learn
- Implement BM25 sparse retrieval algorithm
- Understand TF-IDF scoring
- Implement Reciprocal Rank Fusion (RRF)
- Combine multiple ranking signals
- Compare hybrid vs pure semantic search
- Optimize and tune ranking parameters
- Evaluate results with retrieval metrics (e.g., precision@k, recall@k)
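As a preview of the sparse side, here is a minimal BM25 sketch using only the standard library. It assumes whitespace tokenization, the Lucene-style IDF variant `log(1 + (N - df + 0.5) / (df + 0.5))`, and the commonly used defaults `k1=1.5` and `b=0.75`. The names `tokenize` and `bm25_scores` are illustrative and not part of the lab's code.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real systems use a proper analyzer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # IDF rewards rare terms; this variant is always non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "error code E1234 in payment service",
    "how to reset a password",
    "payment failed with unknown error",
]
print(bm25_scores("E1234 error", docs))
```

The first document should score highest because it contains the rare term `E1234`, which is the kind of exact match that dense retrieval can miss.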
Hybrid Search Flow
Query
↓
├─→ Dense Retrieval (Semantic) → Top-k results + scores
├─→ Sparse Retrieval (BM25) → Top-k results + scores
↓
Combine Rankings (RRF)
↓
Final Ranked List
↓
Return to User
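The "Combine Rankings (RRF)" step in this flow can be sketched as follows. Each document receives `1 / (k + rank)` from every ranked list it appears in, and the contributions are summed. The sketch assumes each retriever returns document IDs ordered best-first and uses `k=60`, a commonly used default. The function name `reciprocal_rank_fusion` and the document IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one ranked list.

    rankings: iterable of lists, each ordered from most to least relevant.
    k: smoothing constant (assumed default of 60); larger values flatten
       the gap between top and lower ranks.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Only the rank matters, so raw scores from different
            # retrievers never need to be put on a common scale.
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


dense_top_k = ["d3", "d1", "d2"]   # hypothetical semantic results
sparse_top_k = ["d1", "d4", "d3"]  # hypothetical BM25 results

for doc_id, score in reciprocal_rank_fusion([dense_top_k, sparse_top_k]):
    print(doc_id, round(score, 5))
```

Documents that appear near the top of both lists (`d1` and `d3` here) rise above documents found by only one retriever. Because RRF uses ranks rather than scores, cosine similarities and BM25 scores can be combined without normalization.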
When to Use Hybrid
- ✅ Exact matches are important (IDs, codes)
- ✅ Mix of semantic + keyword queries
- ✅ Domain-specific terminology
- ✅ Production systems needing robustness
- ✅ When semantic or keyword search alone misses relevant results