Evaluating Good Search Part I: Measure It

Source: https://medium.com/@dtunkelang/evaluating-good-search-part-i-measure-it-5507b2dbf4f6
Author: Daniel Tunkelang
Series: Evaluating Search (Part 1 of 4)

Summary

Opening article of Tunkelang’s four-part series on search evaluation. Anchored in Lord Kelvin’s principle: “If you cannot measure it, you cannot improve it.” Surveys the main search metrics in two groups: supervised metrics, computed from relevance judgments, and unsupervised metrics, computed from searcher behavior.

Supervised Metrics (require judgment labels)

Metric                 What it measures
Precision              Fraction of returned results that are relevant
Recall                 Fraction of all relevant results that were returned
Precision@k            Precision for top k results only
Average Precision@k    Weighted avg giving more weight to top-ranked results
NDCG                   Accounts for relevance gradations + position discounting
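
As a rough illustration of the supervised metrics listed above, here is a minimal Python sketch. It is not taken from the article. The function names, the choice to divide Precision@k by k, the normalization of Average Precision@k by min(k, number of relevant results), and the log2 discount in NDCG are all assumptions; other conventions are in common use.

```python
import math
from typing import Sequence, Set


def precision(returned: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of returned results that are relevant."""
    if not returned:
        return 0.0
    return sum(1 for doc in returned if doc in relevant) / len(returned)


def recall(returned: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of all relevant results that were returned."""
    if not relevant:
        return 0.0
    return len(set(returned) & relevant) / len(relevant)


def precision_at_k(returned: Sequence[str], relevant: Set[str], k: int) -> float:
    """Precision over the top k results (assumed convention: divide by k)."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in returned[:k] if doc in relevant) / k


def average_precision_at_k(returned: Sequence[str], relevant: Set[str], k: int) -> float:
    """Sum of precision at each relevant rank <= k, normalized by
    min(k, |relevant|) (one of several conventions, assumed here)."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(returned[:k], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    denom = min(k, len(relevant))
    return total / denom if denom > 0 else 0.0


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG divided by the DCG of the ideal ordering. The ideal ordering is
    built only from the graded results passed in (a simplifying assumption)."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical example: four returned results, three relevant documents overall.
returned = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
graded_gains = [3, 0, 2, 1]  # hypothetical graded relevance of the returned list

print(precision(returned, relevant))                  # 0.5
print(precision_at_k(returned, relevant, 2))          # 0.5
print(round(recall(returned, relevant), 3))           # 0.667
print(round(average_precision_at_k(returned, relevant, 4), 3))
print(round(ndcg_at_k(graded_gains, 4), 3))
```

The binary metrics take a set of relevant document IDs, while NDCG takes graded gains in ranked order, which mirrors the distinction between yes/no relevance and relevance gradations.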

Unsupervised Metrics (from behavior)

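Metric                         What it measures
CTR (click-through rate)       Fraction of searches that receive clicks
MRR (mean reciprocal rank)     Click signal weighted by the reciprocal of the clicked position, so clicks on earlier results count more
Conversions                    Stronger signal than clicks (purchase, signup, etc.)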
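
The behavioral metrics above can be computed from a search log along these lines. This is a sketch under stated assumptions, not the article's method: the `SearchEvent` record is hypothetical, MRR is taken over the first (highest-ranked) click per search, and searches with no click contribute 0 to MRR.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class SearchEvent:
    """Hypothetical log record for one search."""
    clicked_positions: List[int] = field(default_factory=list)  # 1-based ranks
    converted: bool = False  # e.g. purchase or signup followed the search


def click_through_rate(events: Sequence[SearchEvent]) -> float:
    """Fraction of searches that received at least one click."""
    if not events:
        return 0.0
    return sum(1 for e in events if e.clicked_positions) / len(events)


def mean_reciprocal_rank(events: Sequence[SearchEvent]) -> float:
    """Mean of 1 / (rank of the highest-ranked click); searches with no
    click count as 0 (an assumed convention)."""
    if not events:
        return 0.0
    total = 0.0
    for e in events:
        if e.clicked_positions:
            total += 1.0 / min(e.clicked_positions)
    return total / len(events)


def conversion_rate(events: Sequence[SearchEvent]) -> float:
    """Fraction of searches that led to a conversion."""
    if not events:
        return 0.0
    return sum(1 for e in events if e.converted) / len(events)


# Hypothetical log of four searches.
log = [
    SearchEvent(clicked_positions=[1]),
    SearchEvent(clicked_positions=[3, 2], converted=True),
    SearchEvent(),                       # abandoned search, no click
    SearchEvent(clicked_positions=[4]),
]

print(click_through_rate(log))    # 0.75
print(mean_reciprocal_rank(log))  # 0.4375
print(conversion_rate(log))       # 0.25
```

Even in this tiny log the conversion signal rests on a single search while clicks cover three of four, which is the sparse-versus-plentiful trade-off in miniature.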

Key Insight

Conversions are sparse but strong; clicks are plentiful but noisy. Individual components of the search experience (e.g., spelling correction, autocomplete) need their own targeted metrics rather than relying only on end-to-end measures.

Series

  1. Measure It (this article)
  2. Measuring Searcher Behavior
  3. Evaluating Search - Using Human Judgments (already processed)
  4. When There’s No Conversion Rate

People