Thoughts on Search Result Diversity

Source: https://dtunkelang.medium.com/thoughts-on-search-result-diversity-1df54cb5bf4a
Author: Daniel Tunkelang

Summary

A deeper treatment of diversity in ranked retrieval. Tunkelang distinguishes two roles for diversity: ambiguity hedging (the query might mean multiple things) and novelty seeking (redundancy reduction even for unambiguous queries).

Two Reasons to Diversify

Ambiguity — “jaguar” could be the car, the animal, or the OS. A diverse list covers multiple intents.
Redundancy — Even for clear queries, near-duplicate results waste rank positions.

KL-Divergence Approach

One framing: the ideal result list should match the distribution of relevant documents across subtopics. The mismatch can be measured as the KL-divergence between the list’s topic distribution and the corpus’s topic distribution.
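
As a rough illustration of this framing, the sketch below computes a KL-divergence between a target topic distribution and the topic distribution of a result list. Everything here is an assumption made for the example: the function names, the `target` proportions, the add-epsilon smoothing, and the choice of direction KL(target || list), which penalizes lists that omit topics the target contains. The source does not prescribe any of these details.

```python
import math
from collections import Counter

def topic_distribution(topics, vocabulary, smoothing=1e-6):
    """Smoothed probability distribution of `topics` over `vocabulary`.

    Smoothing (an assumption of this sketch) keeps every probability
    positive so the divergence below stays finite.
    """
    counts = Counter(topics)
    total = len(topics) + smoothing * len(vocabulary)
    return {t: (counts[t] + smoothing) / total for t in vocabulary}

def kl_divergence(p, q):
    """KL(p || q) in bits; assumes p and q share keys and q[t] > 0."""
    return sum(p[t] * math.log2(p[t] / q[t]) for t in p if p[t] > 0)

vocabulary = ["car", "animal", "os"]
# Hypothetical target distribution of relevant documents for "jaguar".
target = {"car": 0.5, "animal": 0.3, "os": 0.2}

# Topic labels of two candidate result lists, in rank order.
balanced = topic_distribution(
    ["car", "car", "animal", "os", "car", "animal"], vocabulary
)
skewed = topic_distribution(["car"] * 6, vocabulary)

balanced_kl = kl_divergence(target, balanced)
skewed_kl = kl_divergence(target, skewed)
```

Under these assumptions, the list that covers all three senses scores a much lower divergence than the list containing only the car sense.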

Greedy Reranking

Practical implementation:

Retrieve top-N with standard relevance ranking
Greedily reorder: at each position, pick the remaining item that maximizes a weighted combination of relevance and novelty (MMR-style)
Optionally cap items per subtopic (categorical diversity)
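
The greedy step can be sketched as a small MMR-style reranker. This is a minimal sketch rather than the author's implementation: `mmr_rerank`, the `relevance` scores, the per-document topic sets, and the Jaccard similarity are all hypothetical choices made for illustration. Novelty is modeled as the negative of the maximum similarity to anything already selected.

```python
def mmr_rerank(candidates, relevance, similarity, lam=0.7, k=10):
    """Greedy MMR: at each position pick the remaining candidate with the
    highest lam * relevance - (1 - lam) * (max similarity to selected)."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(doc):
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max(
                (similarity(doc, s) for s in selected), default=0.0
            )
            return lam * relevance[doc] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical top-N from a standard relevance ranker.
topics = {"d1": {"car"}, "d2": {"car"}, "d3": {"animal"}, "d4": {"os"}}
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.6, "d4": 0.5}

def jaccard(a, b):
    """Topic-set overlap, used here as a stand-in similarity function."""
    union = topics[a] | topics[b]
    return len(topics[a] & topics[b]) / len(union) if union else 0.0

by_relevance = mmr_rerank(list(topics), relevance, jaccard, lam=1.0)
diversified = mmr_rerank(list(topics), relevance, jaccard, lam=0.5)
```

With `lam=1.0` the order is pure relevance (`d1, d2, d3, d4`); with `lam=0.5` the second car result `d2` is pushed behind the animal and OS results.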

Subtopic Coverage Metrics

α-nDCG — penalizes redundant relevant documents at lower ranks
ERR-IA — intent-aware Expected Reciprocal Rank
D-measure — explicit intent coverage
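
To make the first metric concrete, here is a small sketch of α-nDCG. It assumes binary subtopic judgments, a log2 rank discount, and a greedy approximation of the ideal ranking (the exact ideal is expensive to compute in general). The function names and the toy `judgments` are hypothetical.

```python
import math

def alpha_dcg(ranking, judgments, alpha=0.5, k=10):
    """alpha-DCG@k: a document's gain for a subtopic shrinks by
    (1 - alpha) for every earlier document covering that subtopic."""
    seen = {}  # subtopic -> number of higher-ranked docs covering it
    score = 0.0
    for rank, doc in enumerate(ranking[:k], start=1):
        gain = 0.0
        for subtopic in judgments.get(doc, ()):
            gain += (1 - alpha) ** seen.get(subtopic, 0)
            seen[subtopic] = seen.get(subtopic, 0) + 1
        score += gain / math.log2(rank + 1)
    return score

def greedy_ideal(judgments, alpha=0.5, k=10):
    """Approximate the ideal ranking by greedily taking the doc with
    the largest discounted gain at each position."""
    remaining = sorted(judgments)  # sorted for deterministic tie-breaks
    seen, ideal = {}, []
    while remaining and len(ideal) < k:
        def gain(doc):
            return sum((1 - alpha) ** seen.get(s, 0) for s in judgments[doc])
        best = max(remaining, key=gain)
        ideal.append(best)
        remaining.remove(best)
        for s in judgments[best]:
            seen[s] = seen.get(s, 0) + 1
    return ideal

def alpha_ndcg(ranking, judgments, alpha=0.5, k=10):
    """alpha-DCG normalized by the (greedily approximated) ideal."""
    ideal_score = alpha_dcg(greedy_ideal(judgments, alpha, k), judgments, alpha, k)
    return alpha_dcg(ranking, judgments, alpha, k) / ideal_score if ideal_score else 0.0

# Hypothetical judgments: which subtopics each document is relevant to.
judgments = {"d1": {"car"}, "d2": {"car"}, "d3": {"animal"}}
redundant_score = alpha_ndcg(["d1", "d2", "d3"], judgments)
diverse_score = alpha_ndcg(["d1", "d3", "d2"], judgments)
```

In this toy case the ranking that places the animal result before the second car result scores higher, because the second car result's gain is halved wherever it appears, so it costs less to put it last.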

Hard vs. Soft Diversity

Hard: allow at most N results per category (e.g., max 2 results per domain)
Soft: score-based MMR with tunable λ
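
A hard constraint is simple to sketch as a single pass over an already ranked list. The helper below is hypothetical, as are the example URLs. It drops results beyond the cap, which is one possible policy; demoting them to the end of the list would be another.

```python
from collections import defaultdict
from urllib.parse import urlparse

def cap_per_category(ranked, category_of, max_per_category=2):
    """Keep results in rank order, skipping any result whose category
    has already contributed `max_per_category` results."""
    counts = defaultdict(int)
    kept = []
    for doc in ranked:
        category = category_of(doc)
        if counts[category] < max_per_category:
            kept.append(doc)
            counts[category] += 1
    return kept

# Hypothetical ranked URLs; the category is the URL's domain.
ranked_urls = [
    "https://a.example/1",
    "https://a.example/2",
    "https://a.example/3",
    "https://b.example/1",
]
capped = cap_per_category(ranked_urls, lambda url: urlparse(url).netloc)
```

Here the third result from `a.example` is dropped, so the result from `b.example` moves up to third position.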

Relationship to Facets

Diversity and faceted search are complementary: facets let users explicitly constrain intent; diversity hedges when they don’t.

Key Concepts

Ambiguity hedging — covering multiple interpretations
Redundancy reduction — avoiding near-duplicate results
KL-divergence — measuring topic distribution mismatch
Greedy reranking — practical MMR implementation
α-nDCG — diversity-aware evaluation metric

People

Daniel Tunkelang
