Learning to Rank

Definition

Learning to Rank (LTR) is the application of machine learning to the ranking problem in information retrieval — training models to produce optimal rankings given a query and a set of candidate documents, using labeled (query, document, relevance) training data.

Why LTR?

Hand-tuned ranking formulas (BM25 weights, field boosts) are:

  • Brittle: require manual adjustment for new query types
  • Suboptimal: can’t model complex non-linear feature interactions
  • Domain-specific: no transfer across product categories or query types

LTR learns the ranking function from data, capturing interactions between signals (TF-IDF × freshness × popularity × semantic similarity) that are impractical to tune by hand.

LTR Formulations

Pointwise

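Score each document independently, treating ranking as regression or classification on the relevance grade.

  • Loss: MSE on relevance grade (regression) or cross-entropy (classification)
  • Simple, but ignores the relative ordering of documents within a query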
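
A small sketch of why a pointwise loss can disagree with ranking quality. The grades and scores below are made-up values for illustration, and pointwise_mse is a hypothetical helper, not a library function.

```python
def pointwise_mse(scores, grades):
    """Mean squared error between predicted scores and relevance grades."""
    if len(scores) != len(grades):
        raise ValueError("scores and grades must have the same length")
    return sum((s - g) ** 2 for s, g in zip(scores, grades)) / len(scores)


def ranking(scores):
    """Document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical grades for three documents of one query (2 = best, 0 = worst).
grades = [2.0, 1.0, 0.0]

# Model A orders the documents correctly but is off in scale.
scores_a = [0.9, 0.5, 0.1]
# Model B is numerically closer to the grades but swaps the top two documents.
scores_b = [1.4, 1.5, 0.0]

loss_a = pointwise_mse(scores_a, grades)
loss_b = pointwise_mse(scores_b, grades)
order_a = ranking(scores_a)
order_b = ranking(scores_b)
```

Here model B has the lower MSE even though only model A produces the correct order, which is the gap that pairwise and listwise formulations try to close.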

Pairwise

For each query, learn that doc A should rank above doc B.

  • Input: (query, doc_A, doc_B, which_is_better)
  • Loss: hinge loss / logistic loss on pairwise preferences
  • RankNet (2005): foundational pairwise approach
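
A minimal sketch of the pairwise setup, assuming graded labels per query and a RankNet-style logistic loss on the score difference. The helper names make_pairs and pairwise_logistic_loss are illustrative only.

```python
import math


def make_pairs(grades):
    """Index pairs (i, j) where document i has a higher grade than document j."""
    n = len(grades)
    return [(i, j) for i in range(n) for j in range(n) if grades[i] > grades[j]]


def pairwise_logistic_loss(score_better, score_worse):
    """log(1 + exp(-(score_better - score_worse))), computed stably.

    Small when the better document already outscores the worse one.
    """
    diff = score_better - score_worse
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical grades and model scores for one query.
grades = [2, 1, 0]
scores = [0.2, 1.0, -0.5]

pairs = make_pairs(grades)
total_loss = sum(pairwise_logistic_loss(scores[i], scores[j]) for i, j in pairs)
```

Ties in grade produce no pair, so documents with equal labels place no constraint on each other.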

Listwise

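Optimize over the full ranked list for a query, targeting a ranking metric such as NDCG or MAP.

  • Loss: a smooth surrogate for the metric (NDCG itself is non-differentiable because it depends on the sort order)
  • LambdaMART (2010): dominant practical algorithm; its gradients are defined on document pairs, so it is sometimes classed as a pairwise/listwise hybrid
  • Weights each pair's gradient by the change in NDCG from swapping the two documents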
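
The NDCG-based weighting can be sketched as follows, assuming the common exponential gain (2^grade - 1) and log2 position discount. The function names and the example grades are assumptions for illustration, not taken from any particular library.

```python
import math


def dcg(grades):
    """Discounted cumulative gain for grades listed in ranked order."""
    return sum((2 ** g - 1) / math.log2(rank + 2) for rank, g in enumerate(grades))


def ndcg(grades):
    """DCG normalized by the DCG of the ideal (grade-sorted) ordering."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0


def delta_ndcg(grades, i, j):
    """Absolute NDCG change if the documents at positions i and j swap places."""
    swapped = list(grades)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(ndcg(swapped) - ndcg(grades))


def pair_lambda(score_better, score_worse, delta):
    """Gradient magnitude for one pair: RankNet sigmoid term scaled by |delta NDCG|."""
    return delta / (1.0 + math.exp(score_better - score_worse))


# Grades in the order the current model ranks them (hypothetical).
ranked_grades = [0, 2, 1, 0]
top_swap = delta_ndcg(ranked_grades, 0, 1)     # fixing a mistake at the top
bottom_swap = delta_ndcg(ranked_grades, 2, 3)  # reshuffling lower positions
```

Because the position discount is steepest near the top, top_swap comes out larger than bottom_swap, so mistakes at the head of the list receive larger gradients.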

LambdaMART

LambdaMART is the dominant LTR algorithm in production:

  1. MART (Multiple Additive Regression Trees) = gradient boosted decision trees (GBDTs)
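  2. Lambda = gradient weighting by NDCG improvement of swapping two documents

A minimal LightGBM setup (X_train, y_train, and query_sizes are assumed to be prepared elsewhere):

from lightgbm import LGBMRanker

# LambdaMART = LambdaRank gradients on top of gradient-boosted trees.
model = LGBMRanker(
    objective="lambdarank",
    metric="ndcg",
    ndcg_eval_at=[5, 10],  # NDCG@5 and NDCG@10, reported when an eval_set is passed to fit()
    num_leaves=31,
    n_estimators=100,
)
# group = number of rows per query, in row order; each query's rows must be contiguous.
model.fit(X_train, y_train, group=query_sizes)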
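
One way query_sizes could be built, assuming the training rows are already sorted so that each query's documents are contiguous. The query_ids list and the group_sizes helper are hypothetical.

```python
from itertools import groupby


def group_sizes(query_ids):
    """Count consecutive rows per query; rows must already be grouped by query."""
    return [len(list(rows)) for _, rows in groupby(query_ids)]


# One entry per training row, in the same order as X_train and y_train.
query_ids = ["q1", "q1", "q1", "q2", "q2", "q3"]
query_sizes = group_sizes(query_ids)
```

The sizes must sum to the number of training rows; if a query's rows are scattered, groupby would split it into several groups, so sort by query first.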

Features used:

  • BM25 score (title, body, description)
  • Semantic similarity (embedding cosine)
  • Popularity (click rate, sales)
  • Freshness (recency)
  • Price, rating
  • User features (if personalized)

Two-Phase: Retrieval → Ranking

LTR is almost always the second stage in a Retrieval Pipeline:

Fast first-stage (BM25 or bi-encoder) → top-1000
    ↓
LTR reranker (LambdaMART or neural) → top-20

The first stage uses simple fast signals; LTR adds expensive-to-compute features.
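
A toy sketch of the two-phase flow. Term overlap stands in for BM25, and a hand-written scoring function stands in for a trained LTR model; the corpus, field names, and both functions are assumptions for illustration.

```python
def first_stage(query, corpus, k):
    """Cheap candidate retrieval: rank by query-term overlap (a stand-in for BM25)."""
    terms = set(query.lower().split())
    scored = []
    for doc in corpus:
        overlap = len(terms & set(doc["text"].lower().split()))
        if overlap > 0:
            scored.append((doc, overlap))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def rerank(candidates, score_fn, k):
    """Second stage: apply a richer scoring function to the candidates only."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


corpus = [
    {"id": "a", "text": "red running shoes", "popularity": 0.2},
    {"id": "b", "text": "running shoes for trail", "popularity": 0.9},
    {"id": "c", "text": "blue dress", "popularity": 0.99},
    {"id": "d", "text": "shoes", "popularity": 0.5},
]

candidates = first_stage("running shoes", corpus, k=3)
# Stand-in for a trained ranker: combine the first-stage score with popularity.
top = rerank(candidates, lambda c: c[1] + c[0]["popularity"], k=2)
top_ids = [doc["id"] for doc, _ in top]
```

The reranker only sees what the first stage returns, so a relevant document missed in retrieval cannot be recovered later.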

Neural LTR

More recently, neural rerankers have been used in place of, or alongside, LambdaMART:

  • Cross-Encoder: joint query-document encoding; trained with pointwise, pairwise, or listwise (softmax) losses
  • MonoT5: T5-based pointwise ranker
  • RankLLaMA: LLaMA fine-tuned for ranking

Training Data

LTR requires labeled data:

  • Explicit judgments: Judgment Lists with annotated (query, doc, grade) triples
  • Click data: position-debiased click labels from query logs (Click Signals)
  • Conversion data: purchase signals for e-commerce

Position Bias in LTR Training

Click data has position bias: documents at rank 1 attract many clicks regardless of quality. Common corrections:

  • IPS (Inverse Propensity Scoring): weight each click by the inverse of the probability that its position was examined
  • Counterfactual evaluation: compare click rates while controlling for position
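
A minimal IPS sketch, assuming examination propensities per position are already known. The propensity values and click counts below are invented, and ips_click_rate is a hypothetical helper; in practice propensities have to be estimated, for example from randomization experiments.

```python
def ips_click_rate(clicks, impressions, position, propensity):
    """Click rate corrected for how often the position is examined."""
    return (clicks / propensity[position]) / impressions


# Assumed probability that a user examines each rank (hypothetical values).
propensity = {1: 1.0, 2: 0.5, 3: 0.25}

# Doc X was always shown at rank 1, doc Y always at rank 3.
raw_x = 30 / 100
raw_y = 10 / 100
ips_x = ips_click_rate(clicks=30, impressions=100, position=1, propensity=propensity)
ips_y = ips_click_rate(clicks=10, impressions=100, position=3, propensity=propensity)
```

On raw click rate X looks three times better than Y, but after reweighting Y comes out ahead (0.4 vs 0.3), because its clicks were earned at a rarely examined position.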

Debiased training data generally yields better LTR models than raw click labels.

Articles

People

  • Doug Turnbull — “How LambdaMART works” post; LTR in production