Awesome Search — Information Retrieval & Search Knowledge Graph

Hello, I am Andrew.

I’ve been building e-commerce search applications for 15+ years. Over that time, I’ve collected and connected ideas from publications, conference talks, books, research papers, blog posts, and practitioners across the information retrieval ecosystem.

This knowledge graph maps many of the resources that have influenced my thinking, organized by topic and interconnected through shared concepts. Because search is inherently multidisciplinary, many resources are linked to multiple areas of the graph, reflecting how ideas from ranking, relevance, user behavior, machine learning, evaluation, and system design often overlap.

⭐ Star us on GitHub — it helps!

This is a semantic knowledge graph built from the Awesome Search curated list. It contains article notes (for paywalled articles, only summaries and key concepts are included), concept notes, topic notes, people notes, case study notes, company notes, and tool notes, all interconnected through wikilinks.

Maps of Content (Entry Points)

Domain | MOC
Agentic Search & Embeddings | MOC - Agentic Search and Embeddings
Search Quality & Query Understanding | MOC - Search Quality Assurance and Query Understanding
Ranking & Retrieval | MOC - Ranking and Retrieval
Search UX & Discovery | MOC - Search UX and Discovery
Case Studies | MOC - Case Studies
Architecture & Search Team | MOC - Architecture and Search Team

Core Concepts by Domain

Retrieval

BM25 · Dense Vector Retrieval · Sparse Vector Retrieval · Hybrid Search · Reciprocal Rank Fusion · Relative Score Fusion · Semantic Boosting · Semantic Search · SIRA

Embeddings

Bi-Encoder · Cross-Encoder · ColBERT · Late Interaction · Matryoshka Embeddings · SPLADE · ELSER · Task-Aware Embeddings · Hypothetical Document Embeddings · Dimensionality Reduction · PCA · t-SNE · UMAP · Vector Quantization · Scalar Quantization · Binary Quantization · TurboQuant

Ranking

Learning to Rank · Personalization · Position Bias · Diversity Metrics · Retrieval Pipeline · Results Boosting · Results Merchandising · Signal Downboosting

Evaluation

NDCG · MRR · MAP · Precision and Recall · UDCG · Search Evaluation · Judgment Lists · LLM as Judge · Session-Based Evaluation · Click Signals · Pointwise Relevance Evaluation · Pairwise Relevance Evaluation · Listwise Relevance Evaluation

Query Understanding

Query Understanding · Query Types · Search Intent · Query Segmentation · Synonyms · Spelling Correction · Autocomplete · Faceted Search · Zero Results · Collocations · Query Relaxation

Architecture & RAG

Search Architecture · Knowledge Graph Search · RAG · Agentic Search · Search-R1 · Reinforcement Learning for Search · Vector Filtering · Text Chunking · Clean Context

Topics

Practice-oriented guides: how to do, or deal with, something in search.

Search Quality Assurance · A-B Testing for Search · Managing a Search Team · Understaffed Search Team · Hiring for Search · Economics of Search · E-commerce Search · Autocomplete and Autosuggest · Search Result Diversity · Synonyms and Vocabulary Management · Query Understanding in Practice · Multilingual Search · Relevance Program Setup · Personalization in Search · Conversational and Agentic Search · Spelling Correction in Search · Dimensionality Reduction vs Quantization · Elasticsearch vs OpenSearch

Tools

Quepid · Querqy · Elasticsearch · Qdrant Vector DB · Weaviate Vector DB

Companies

Technology Providers: Elastic · Vespa · Meta · Cohere · OpenSource Connections · Algolia · Weaviate · searchHub · Empathy · Sease · MongoDB · Voyage AI · Qdrant · Hornet

End Users: Uber · Airbnb · Zalando · Slack · Canva · Netflix · Twitter · Etsy · Skyscanner · Grubhub · Spotify · Carousell · Vinted · Shopify · Otto

Case Studies

Uber Eats - Scaling Search for Food Delivery · Airbnb - ML-Powered Experiences Ranking · Zalando - Self-DoS via Facet Aggregation · Slack - Enterprise Message Search with LTR · Etsy - Search Quality and Query Understanding · Skyscanner - Learning to Rank for Flights · Netflix - Content Search Architecture · Canva - Search Pipeline Modernization

Key People

Daniel Tunkelang · Doug Turnbull · James Rubinstein · Omar Khattab · Jo Kristian Bergum · Trey Grainger · Andreas Wagner · Giovanni Fernandez-Kincade · Wolf Garbe · Eugene Yan

Stats

  • 136+ article notes
  • ~82 concept notes (incl. PCA, t-SNE, UMAP, Dimensionality Reduction, TurboQuant, RaBitQ, BBQ, HNSW, SQ, BQ, Search-R1)
  • 16 topic notes (incl. Dimensionality Reduction vs Quantization, Elasticsearch vs OpenSearch)
  • ~49 people notes (incl. Geoffrey Hinton, Laurens van der Maaten)
  • 8 case study notes
  • 28 company notes
  • 5 tool notes (Quepid, Querqy, Elasticsearch, Qdrant Vector DB, Weaviate Vector DB)
  • 6 Maps of Content

See History for the full note-addition log.