Setting Up a Relevance Evaluation Program

Source: https://medium.com/@jamesrubinstein/setting-up-a-relevance-evaluation-program-c955d32fba0e
Author: James Rubinstein

Summary

A step-by-step guide to building a human relevance judgment program from scratch, the operational backbone of a serious Search Evaluation practice.

The Six Steps

1. Understand Your User Task

What are users actually doing? Information gathering, shopping, navigation, comparison. Task type determines what “relevant” means.

2. Select Evaluation Methodology

Result set preference — which set of results is better overall?
Document preference — is doc A or doc B more relevant?
Binary relevance — relevant or not?
Graded relevance — 4-point scale (Perfect → Excellent → Good → Poor)

A 4-point scale is practical for most cases.

3. Gather Queries

Use weighted random sampling in which each query's weight is its frequency. This avoids over-representing outlier tail queries while still including some tail coverage. See Succeeding with Relevance Evaluation using PPS Sampling.
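A minimal sketch of frequency-weighted query sampling, assuming the query log is available as a plain list with one entry per search. The function name `sample_queries` and its parameters are illustrative and do not come from the article. To return distinct queries it uses the Efraimidis-Spirakis key trick for weighted sampling without replacement, which is one possible choice rather than the author's stated method.

```python
import random
from collections import Counter


def sample_queries(query_log, k, seed=0):
    """Draw up to k distinct queries, favouring frequent ones.

    query_log: list of raw query strings, one entry per search (assumed format).
    """
    counts = Counter(query_log)  # query -> frequency, used as the weight
    rng = random.Random(seed)    # seeded so the sample is reproducible
    # Efraimidis-Spirakis: give each query the key u ** (1 / weight) with
    # u uniform in [0, 1), then keep the k largest keys. Heavier queries
    # tend to get larger keys, but tail queries can still be selected.
    keyed = [(rng.random() ** (1.0 / n), q) for q, n in counts.items()]
    keyed.sort(reverse=True)
    return [q for _, q in keyed[:k]]


if __name__ == "__main__":
    log = ["iphone"] * 50 + ["laptop"] * 30 + ["usb c hub"] * 5 + ["left handed scissors"]
    print(sample_queries(log, k=3, seed=1))
```

Head queries are very likely to appear in the sample, while a rare query is included only occasionally, which matches the goal of some tail coverage without letting the tail dominate.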

4. Collect Documents

Run sampled queries through your search engine; capture the result sets to be judged.
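The collection step can be sketched as follows. `search_fn` is a hypothetical callable standing in for whatever client your search engine exposes, and the `top_k` cutoff and row layout are assumptions for illustration, not details from the article.

```python
def collect_results(queries, search_fn, top_k=10):
    """Run each query and return one row per (query, document) pair to judge.

    search_fn: hypothetical callable mapping a query string to a ranked
               list of document ids.
    """
    rows = []
    for query in queries:
        doc_ids = search_fn(query)[:top_k]  # keep only the ranks a rater will see
        for rank, doc_id in enumerate(doc_ids, start=1):
            rows.append({"query": query, "rank": rank, "doc_id": doc_id})
    return rows


if __name__ == "__main__":
    # Toy stand-in for a real engine, for demonstration only.
    fake_index = {"laptop": ["d7", "d2", "d9"], "iphone": ["d1"]}
    rows = collect_results(["laptop", "iphone"], lambda q: fake_index.get(q, []), top_k=2)
    for row in rows:
        print(row)
```

Storing the rank alongside each document keeps the captured result set reproducible even if the engine's ranking changes later.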

5. Recruit Raters

“The single most important part of a human relevance evaluation program is the humans.”

Match rater expertise to domain: lawyers for legal, makeup enthusiasts for cosmetics. Options: internal staff, consulting firms, crowdsourcing platforms.

6. Execute Ratings

Use gold-rater verification tiers: standard raters escalate disagreements to expert judges. Capture structured judgments, not free text.
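One way to picture structured judgments and the escalation tier is the sketch below. The `Judgment` record, the numeric values assigned to the four grades, and the `max_spread` disagreement threshold are all assumptions made for illustration; the article does not prescribe a schema or an escalation rule.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import IntEnum


class Grade(IntEnum):
    # The 4-point scale from step 2; the numeric values are an assumption.
    POOR = 0
    GOOD = 1
    EXCELLENT = 2
    PERFECT = 3


@dataclass(frozen=True)
class Judgment:
    """A structured judgment: fixed fields instead of free text."""
    query: str
    doc_id: str
    rater_id: str
    grade: Grade


def needs_escalation(judgments, max_spread=1):
    """Return (query, doc_id) pairs whose grades differ by more than max_spread.

    These are the pairs a standard-rater tier would hand to expert judges.
    """
    by_pair = defaultdict(list)
    for j in judgments:
        by_pair[(j.query, j.doc_id)].append(j.grade)
    return sorted(
        pair for pair, grades in by_pair.items()
        if max(grades) - min(grades) > max_spread
    )


if __name__ == "__main__":
    judgments = [
        Judgment("laptop", "d7", "rater_a", Grade.PERFECT),
        Judgment("laptop", "d7", "rater_b", Grade.POOR),
        Judgment("laptop", "d2", "rater_a", Grade.GOOD),
        Judgment("laptop", "d2", "rater_b", Grade.EXCELLENT),
    ]
    print(needs_escalation(judgments))
```

Because each judgment is a typed record, disagreement can be computed mechanically, which is harder to do with free-text comments.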

People

James Rubinstein
