Demystifying nDCG and ERR

Source: https://opensourceconnections.com/blog/2019/12/09/demystifying-ndcg-and-err/
Publisher: OpenSource Connections

Summary

A clear explainer contrasting NDCG with ERR (Expected Reciprocal Rank) — two ranking metrics that model user behavior differently. Aimed at search practitioners who have heard of both but aren’t sure when to use each.

NDCG Review

NDCG assumes:

All relevant documents are worth finding
Position matters (logarithmic discount)
Documents don’t “compete” — each has independent value

This is the independent utility model: each document contributes its relevance independently.

ERR (Expected Reciprocal Rank)

ERR models users as a cascade: they read results from top to bottom, and their probability of being satisfied at rank i depends on not having been satisfied at ranks 1 through i-1.

ERR = Σᵢ  (1/i) × Rᵢ × Πⱼ<ᵢ (1 - Rⱼ)

Where Rᵢ is the probability of satisfaction at rank i (function of relevance grade).

Intuition: If the first result is perfect (Rᵢ = 1.0), the user is satisfied and doesn’t look further. A highly relevant document at rank 3 contributes almost nothing if ranks 1 and 2 are also highly relevant.

NDCG vs. ERR: Key Difference

Scenario	NDCG	ERR
Perfect result at rank 1, more relevant at rank 5	Rewards rank 5 (adds to DCG)	Minimal reward (user already satisfied)
Two good results at rank 1 and 2	Sums both	Rank 2 partially discounted (user may stop at 1)
Ambiguous query needing multiple angles	Rewards diversity of good results	Stops at first satisfaction
Known-item search	Equivalent to MRR	Similar behavior

When to Use Each

NDCG is better for:

Research evaluation (multiple relevant docs expected)
Exploratory/informational queries
E-commerce (user wants to see multiple options)

ERR is better for:

Navigational queries (user wants one specific result)
Known-item search
When “one great result > two good results” is the right model

Normalized vs. Unnormalized

Both NDCG and ERR can be computed normalized (by ideal) or unnormalized:

Normalized: [0,1] range, enables comparison across query sets with different difficulty
Unnormalized DCG: measures absolute quality; harder to compare across queries

Flavors of NDCG — NDCG formulation variants
Measuring Search - Metrics Matter — metric selection in practice
Session vs Query based Search Evals — ERR relates to session models

NDCG — primary metric discussed
MRR — related cascade model
Search Evaluation — where these fit
Session-Based Evaluation — ERR models session-like behavior

Awesome Search KG

Explorer

Demystifying nDCG and ERR

Demystifying nDCG and ERR

Summary

NDCG Review

ERR (Expected Reciprocal Rank)

NDCG vs. ERR: Key Difference

When to Use Each

Normalized vs. Unnormalized

Graph View

Table of Contents

Backlinks

Awesome Search KG

Explorer

Demystifying nDCG and ERR

Demystifying nDCG and ERR

Summary

NDCG Review

ERR (Expected Reciprocal Rank)

NDCG vs. ERR: Key Difference

When to Use Each

Normalized vs. Unnormalized

Related Articles

Related Concepts

Graph View

Table of Contents

Backlinks