MMR — Maximal Marginal Relevance

Note: MMR (Maximal Marginal Relevance) is a diversity metric and reranking strategy. Do not confuse with MRR (Mean Reciprocal Rank), a ranking quality metric.

A reranking method that balances relevance and novelty. Instead of ranking purely by relevance, MMR penalizes results that are similar to already-selected results — reducing redundancy while preserving quality.

Formula

Where:

  • = candidate result set
  • = already-selected results
  • = query
  • = query-document relevance (e.g., cosine similarity)
  • = document-document similarity
  • = tradeoff parameter (0 = pure diversity, 1 = pure relevance)

Algorithm

Greedy iterative selection:

  1. Select the most relevant document as the first result
  2. For each subsequent position, pick the document that maximizes the MMR score — highest relevance minus penalty for similarity to already-chosen documents
  3. Repeat until the result list is full

Properties

  • Tunable: λ controls the relevance/diversity tradeoff
  • Greedy: not globally optimal, but O(n) per step — practical for reranking
  • Similarity-agnostic: works with any similarity function (BM25 score, embedding cosine, etc.)

Use Cases

  • Search result diversification: reduce ten near-duplicate product results in favor of different categories
  • Keyphrase extraction: select diverse keyphrases from a document rather than all from the same topic cluster
  • RAG context selection: choose diverse passages for LLM context to avoid redundant information
  • Facet result pages: ensure different facet values are represented in top results

Tradeoffs

Relevance-focused (λ→1)Diversity-focused (λ→0)
Top results are most relevantTop results cover more aspects
May show redundant contentMay surface lower-relevance items
Better for known-item searchBetter for exploratory / ambiguous queries
  • APD — Average Pairwise Distance, a passive diversity measurement (vs MMR’s active reranking)
  • Diversity Metrics — overview of diversity approaches
  • Reranking — MMR is applied as a post-retrieval reranking step