Awesome Search — Information Retrieval & Search Knowledge Graph
Hello, I am Andrew.
I’ve been building e-commerce search applications for 15+ years. Over that time, I’ve collected and connected ideas from publications, conference talks, books, research papers, blog posts, and practitioners across the information retrieval ecosystem.
This knowledge graph maps many of the resources that have influenced my thinking, organized by topic and interconnected through shared concepts. Because search is inherently multidisciplinary, many resources are linked to multiple areas of the graph, reflecting how ideas from ranking, relevance, user behavior, machine learning, evaluation, and system design often overlap.
⭐ Star us on GitHub — it helps!
This is a semantic knowledge graph built from the Awesome Search curated list. It contains article notes (for paywalled articles, only summaries and key concepts are included), concept notes, topic notes, people notes, case study notes, and company notes, all interconnected through wikilinks.
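Because the notes are connected through wikilinks, the graph can also be explored programmatically. The following is a minimal sketch, not part of the repository: it assumes Obsidian-style wikilink syntax (`[[Target]]`, `[[Target|alias]]`, `[[Target#Heading]]`), and the function names `extract_links` and `build_backlinks` as well as the sample note texts are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Assumed Obsidian-style syntax: [[Target]], [[Target|alias]], [[Target#Heading]].
# Group 1 captures only the target note title.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(text):
    """Return the note titles that a piece of note text links to, in order."""
    return [m.group(1).strip() for m in WIKILINK.finditer(text)]


def build_backlinks(notes):
    """Map each linked title to the set of note titles that link to it.

    `notes` is a dict of {note title: note text}. Loading notes from disk is
    left out on purpose, since the repository layout is not assumed here.
    """
    backlinks = defaultdict(set)
    for title, text in notes.items():
        for target in extract_links(text):
            backlinks[target].add(title)
    return dict(backlinks)


# Hypothetical note contents, for illustration only.
notes = {
    "Hybrid Search": "Combines [[BM25]] with [[Dense Vector Retrieval|dense retrieval]].",
    "Reciprocal Rank Fusion": "Merges ranked lists, e.g. from [[BM25]] and [[Hybrid Search#Fusion]].",
}
backlinks = build_backlinks(notes)
```

With the sample notes above, `backlinks["BM25"]` contains both "Hybrid Search" and "Reciprocal Rank Fusion", which is the kind of cross-topic overlap the graph is meant to surface.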
Maps of Content (Entry Points)
| Domain | MOC |
|---|---|
| Agentic Search & Embeddings | MOC - Agentic Search and Embeddings |
| Search Quality & Query Understanding | MOC - Search Quality Assurance and Query Understanding |
| Ranking & Retrieval | MOC - Ranking and Retrieval |
| Search UX & Discovery | MOC - Search UX and Discovery |
| Case Studies | MOC - Case Studies |
| Architecture & Search Team | MOC - Architecture and Search Team |
Core Concepts by Domain
Retrieval
BM25 · Dense Vector Retrieval · Sparse Vector Retrieval · Hybrid Search · Reciprocal Rank Fusion · Relative Score Fusion · Semantic Boosting · Semantic Search · SIRA
Embeddings
Bi-Encoder · Cross-Encoder · ColBERT · Late Interaction · Matryoshka Embeddings · SPLADE · ELSER · Task-Aware Embeddings · Hypothetical Document Embeddings · Dimensionality Reduction · PCA · t-SNE · UMAP · Vector Quantization · Scalar Quantization · Binary Quantization · TurboQuant
Ranking
Learning to Rank · Personalization · Position Bias · Diversity Metrics · Retrieval Pipeline · Results Boosting · Results Merchandising · Signal Downboosting
Evaluation
NDCG · MRR · MAP · Precision and Recall · UDCG · Search Evaluation · Judgment Lists · LLM as Judge · Session-Based Evaluation · Click Signals · Pointwise Relevance Evaluation · Pairwise Relevance Evaluation · Listwise Relevance Evaluation
Query Understanding
Query Understanding · Query Types · Search Intent · Query Segmentation · Synonyms · Spelling Correction · Autocomplete · Faceted Search · Zero Results · Collocations · Query Relaxation
Architecture & RAG
Search Architecture · Knowledge Graph Search · RAG · Agentic Search · Search-R1 · Reinforcement Learning for Search · Vector Filtering · Text Chunking · Clean Context
Topics
Practice-oriented guides — how to DO or deal with something in search.
Search Quality Assurance · A-B Testing for Search · Managing a Search Team · Understaffed Search Team · Hiring for Search · Economics of Search · E-commerce Search · Autocomplete and Autosuggest · Search Result Diversity · Synonyms and Vocabulary Management · Query Understanding in Practice · Multilingual Search · Relevance Program Setup · Personalization in Search · Conversational and Agentic Search · Spelling Correction in Search · Dimensionality Reduction vs Quantization · Elasticsearch vs OpenSearch
Tools
Quepid · Querqy · Elasticsearch · Qdrant Vector DB · Weaviate Vector DB
Companies
Technology Providers: Elastic · Vespa · Meta · Cohere · OpenSource Connections · Algolia · Weaviate · searchHub · Empathy · Sease · MongoDB · Voyage AI · Qdrant · Hornet
End Users: Uber · Airbnb · Zalando · Slack · Canva · Netflix · Twitter · Etsy · Skyscanner · Grubhub · Spotify · Carousell · Vinted · Shopify · Otto
Case Studies
Uber Eats - Scaling Search for Food Delivery · Airbnb - ML-Powered Experiences Ranking · Zalando - Self-DoS via Facet Aggregation · Slack - Enterprise Message Search with LTR · Etsy - Search Quality and Query Understanding · Skyscanner - Learning to Rank for Flights · Netflix - Content Search Architecture · Canva - Search Pipeline Modernization
Key People
Daniel Tunkelang · Doug Turnbull · James Rubinstein · Omar Khattab · Jo Kristian Bergum · Trey Grainger · Andreas Wagner · Giovanni Fernandez-Kincade · Wolf Garbe · Eugene Yan
Stats
- 136+ article notes
- ~82 concept notes (incl. PCA, t-SNE, UMAP, Dimensionality Reduction, TurboQuant, RaBitQ, BBQ, HNSW, SQ, BQ, Search-R1)
- 16 topic notes (incl. Dimensionality Reduction vs Quantization, Elasticsearch vs OpenSearch)
- ~49 people notes (incl. Geoffrey Hinton, Laurens van der Maaten)
- 8 case study notes
- 28 company notes
- 5 tool notes (Quepid, Querqy, Elasticsearch, Qdrant Vector DB, Weaviate Vector DB)
- 6 Maps of Content
See History for the full note-addition log.