LTR Feature Engineering

Why features matter

LTR models are only as good as the signals fed into them. The model learns which features predict relevance, but it can only work with what you give it. Feature engineering is where most of the practical leverage lies — a well-chosen 20-feature set often outperforms a 200-feature model built on weak signals.

A feature is useful if it helps answer one of three questions:

  1. Is this product relevant to the query? (query × document interaction)
  2. Does this user tend to prefer products like this? (user × document personalization)
  3. Is this product generally successful/popular? (document-level popularity/quality)

The most valuable features are usually interaction features between query × product or user × product, not raw product attributes alone.


Feature taxonomy

1. Document/product features

Raw attributes of the item being ranked. Useful as inputs to interaction and personalization features; less useful on their own.

| Feature | Example value | Value alone | Best use |
|---|---|---|---|
| brand | Nike | Medium | brand_match, user_brand_affinity |
| category | Running Shoes | Medium | category_match, user_category_affinity |
| gender | Women | Medium | gender_match, user_gender_affinity |
| price_bucket | 100–150 | Medium | user_price_affinity, price competitiveness |
| in_stock | true | Useful | Directly rankable |
| rating_bucket | 4.5+ | Useful | Directly rankable |
| color | Red | Low–medium | color_match, user_color_affinity |
| material | Leather | Low | material_match in niche domains |
| size | XL | Low | Only useful if not pre-filtered (see below) |
| weight | 500g | Very low | Domain-specific |

Note on size: many pipelines filter by size before ranking, so every candidate is the right size — the feature then carries zero discriminating information. Only add size if your retrieval stage does not pre-filter.
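One way to check whether a feature such as size has been neutralized by pre-filtering is to test whether it takes a single value inside every query's candidate list. The following is a minimal sketch; the row layout and the `query_id` field are assumptions made for illustration, not part of any particular pipeline.

```python
from collections import defaultdict

def constant_within_queries(rows, feature):
    """Return True if `feature` has exactly one value inside every query's candidate list."""
    values_per_query = defaultdict(set)
    for row in rows:
        values_per_query[row["query_id"]].add(row[feature])
    return all(len(values) == 1 for values in values_per_query.values())

# Hypothetical candidates produced by a retrieval stage that pre-filters on size.
rows = [
    {"query_id": "q1", "size": "XL", "brand": "Nike"},
    {"query_id": "q1", "size": "XL", "brand": "Adidas"},
    {"query_id": "q2", "size": "M", "brand": "Nike"},
    {"query_id": "q2", "size": "M", "brand": "Puma"},
]

size_is_constant = constant_within_queries(rows, "size")    # size never varies within a list
brand_is_constant = constant_within_queries(rows, "brand")  # brand does vary within a list
```

A feature that is constant within each list cannot, by itself, separate candidates that compete for the same query, even if its value differs from one query to the next.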


2. Query × product interaction features

These are the highest-value features. They capture how well the product matches what the user typed.

| Feature | Example condition | Strength |
|---|---|---|
| brand_match | query = “nike shoes” AND product.brand = Nike | Very strong |
| category_match | query = “running shoes” AND product.category = Running Shoes | Very strong |
| color_match | query = “red shoes” AND product.color = Red | Strong when color appears in queries |
| gender_match | query = “women’s running shoes” AND product.gender = Women | Strong in apparel |
| material_match | query = “leather boots” AND product.material = Leather | Domain-specific |
| bm25_title | BM25 of query against product title | Strong |
| bm25_description | BM25 of query against description | Medium |
| semantic_similarity | Cosine between query embedding and product embedding | Strong |

The *_match features require parsing the query and comparing the result against product attributes, not just passing the attribute value through. The BM25 and embedding features are scored directly on the raw query text.
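As a rough illustration of how the match and similarity features might be computed, the sketch below uses naive whitespace tokenization and toy two-dimensional embeddings. The helper names (`attribute_match`, `cosine_similarity`) and the embedding values are assumptions for this example; a production system would use a proper query parser and a real embedding model.

```python
import math

def tokenize(text):
    """Lowercase and split on whitespace (a real pipeline would also normalize punctuation)."""
    return text.lower().split()

def attribute_match(query, attribute_value):
    """1.0 if every token of the attribute value appears in the query, else 0.0."""
    query_tokens = set(tokenize(query))
    value_tokens = tokenize(attribute_value)
    return float(bool(value_tokens) and all(t in query_tokens for t in value_tokens))

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical product record and query.
product = {"brand": "Nike", "category": "Running Shoes", "color": "Red"}
query = "nike running shoes"

# Toy embeddings standing in for the output of an embedding model.
query_embedding = [0.6, 0.8]
product_embedding = [0.8, 0.6]

features = {
    "brand_match": attribute_match(query, product["brand"]),
    "category_match": attribute_match(query, product["category"]),
    "color_match": attribute_match(query, product["color"]),
    "semantic_similarity": cosine_similarity(query_embedding, product_embedding),
}
```

Here brand_match and category_match fire because the query names both, while color_match stays at 0.0 because the query does not mention a color.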


3. User × product personalization features

Capture whether this user specifically tends to prefer items like this.

| Feature | Signal source | Strength |
|---|---|---|
| user_brand_affinity | User click/purchase history on brand | Very strong |
| user_category_affinity | User history in category | Very strong |
| user_price_affinity | User’s historical price range | Strong |
| user_color_affinity | User’s repeated color selections | Medium (fashion-heavy) |
| user_material_affinity | User’s material preferences | Domain-specific |

These require user history storage and real-time lookup — typically served from a Feature Store (e.g. Redis) rather than the search index.
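A simple way to think about an affinity feature is as the share of a user's past interactions that involved a given attribute value. The sketch below assumes an in-memory list of events as a stand-in for whatever the feature store returns; the event schema and the `attribute_affinity` helper are hypothetical.

```python
from collections import Counter

def attribute_affinity(history, attribute, value):
    """Share of the user's past events whose `attribute` equals `value` (0.0 with no history)."""
    if not history:
        return 0.0
    counts = Counter(event[attribute] for event in history)
    return counts[value] / len(history)

# Hypothetical interaction history for one user, as it might come back from a lookup.
history = [
    {"brand": "Nike", "category": "Running Shoes", "event": "click"},
    {"brand": "Nike", "category": "Running Shoes", "event": "purchase"},
    {"brand": "Adidas", "category": "Sandals", "event": "click"},
    {"brand": "Nike", "category": "Sandals", "event": "click"},
]

user_brand_affinity = attribute_affinity(history, "brand", "Nike")                    # 3 of 4 events
user_category_affinity = attribute_affinity(history, "category", "Running Shoes")     # 2 of 4 events
cold_start_affinity = attribute_affinity([], "brand", "Nike")                         # no history
```

This version weights clicks and purchases equally and ignores recency; weighting purchases more heavily or decaying old events are natural refinements, but they are design choices the text above does not prescribe.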


4. Behavioral / popularity features

Global signals about how the product performs across all users.

| Feature | Meaning | Strength |
|---|---|---|
| ctr | Click-through rate | Strong |
| purchase_rate | Conversion rate | Very strong |
| popularity | Raw impression/click count | Strong |
| freshness | Recency of product / last update | Domain-specific |

These are the easiest to add and often have a strong baseline effect. A product with high purchase rate ranks well even without personalization.
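The rates above can be derived from raw counters. The sketch below makes two assumptions that the text does not specify: purchase_rate is taken as purchases per click, and both rates use additive smoothing toward a prior so that products with very few impressions do not get extreme values. The prior values are placeholders, not recommendations.

```python
def smoothed_rate(successes, trials, prior_rate=0.05, prior_weight=20):
    """Rate pulled toward `prior_rate` when `trials` is small (additive smoothing)."""
    return (successes + prior_rate * prior_weight) / (trials + prior_weight)

# Hypothetical aggregated counters for one product.
stats = {"impressions": 1000, "clicks": 80, "purchases": 8}

ctr = smoothed_rate(stats["clicks"], stats["impressions"])
purchase_rate = smoothed_rate(stats["purchases"], stats["clicks"], prior_rate=0.1)
popularity = stats["clicks"]

# A brand-new product with no traffic falls back to the prior instead of dividing by zero.
new_product_ctr = smoothed_rate(0, 0)
```

Without smoothing, a product with one impression and one click would show a CTR of 1.0 and could outrank established items on noise alone.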


Suggested minimal feature set

If starting from scratch, a practical first version:

Document features:

  • brand, category, gender, price_bucket, in_stock, rating_bucket

Interaction features:

  • bm25_title, bm25_description, semantic_similarity
  • brand_match, category_match

Behavioral features:

  • ctr, purchase_rate, popularity

Personalization features (second iteration):

  • user_brand_affinity, user_category_affinity, user_price_affinity

This set covers the three core questions and captures most of the reachable value before introducing exotic signals.
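To make the first-version set concrete, the sketch below turns per-candidate feature dictionaries into a fixed-order matrix plus per-query group sizes, a layout commonly accepted by gradient-boosted ranking libraries. Only the numeric and boolean features are included; categorical ones (brand, category, gender, and the bucket features) would need an encoding step that is left out here. The row schema and the names `FEATURE_ORDER` and `build_training_arrays` are assumptions for illustration.

```python
from itertools import groupby

# Fixed column order so training and serving agree on which column is which.
FEATURE_ORDER = [
    "in_stock", "bm25_title", "bm25_description", "semantic_similarity",
    "brand_match", "category_match", "ctr", "purchase_rate", "popularity",
]

def to_vector(features, default=0.0):
    """Map a feature dict to a list of floats in FEATURE_ORDER; missing features get `default`."""
    return [float(features.get(name, default)) for name in FEATURE_ORDER]

def build_training_arrays(rows):
    """Rows must be sorted by query_id. Returns (X, y, group_sizes)."""
    X, y, group_sizes = [], [], []
    for _, group in groupby(rows, key=lambda r: r["query_id"]):
        group = list(group)
        group_sizes.append(len(group))
        for row in group:
            X.append(to_vector(row["features"]))
            y.append(row["label"])
    return X, y, group_sizes

# Hypothetical judged candidates for two queries.
rows = [
    {"query_id": "q1", "label": 1, "features": {"in_stock": True, "bm25_title": 7.2, "brand_match": 1.0}},
    {"query_id": "q1", "label": 0, "features": {"in_stock": True, "bm25_title": 3.1}},
    {"query_id": "q2", "label": 1, "features": {"in_stock": False, "bm25_title": 5.0, "ctr": 0.08}},
]

X, y, group_sizes = build_training_arrays(rows)
```

Keeping the column order in one shared constant avoids a common failure mode where the training pipeline and the serving path silently disagree about feature positions.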


Why feature engineering is critical for LTR

  • Garbage in, garbage out: a well-tuned LambdaMART model on weak features will often underperform a simpler linear model on strong ones.
  • Feature orthogonality matters: redundant features (brand + brand_with_spaces) waste model capacity; distinct signals (query_match vs user_affinity) compound.
  • Interaction features dominate: a raw attribute like color = Red says little about relevance to a given query. color_match = true (query contains “red” AND product.color = Red) is a strong signal. The model cannot learn this interaction from the raw value alone, because it never sees what the query asked for.
  • Behavioral signals generalize: CTR and purchase rate work across all query types without any query parsing.
  • Personalization signals are multiplicative: combining user affinity with query match is more powerful than either alone.

The rule of thumb: invest in interaction features first, then personalization, then exotic product attributes. Behavioral features are cheap enough to include from the start.
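The point about raw attributes versus interaction features can be seen on a toy dataset. In the sketch below, the label is 1 exactly when the product color equals the color named in the query; this labeling rule and the four query/product pairs are constructed purely for illustration.

```python
def mean_label(pairs, feature, value):
    """Average relevance label over the pairs where `feature` equals `value`."""
    labels = [p["label"] for p in pairs if p[feature] == value]
    return sum(labels) / len(labels)

# Toy data: every combination of a query color and a product color.
pairs = []
for query_color in ("red", "blue"):
    for product_color in ("red", "blue"):
        pairs.append({
            "is_red": product_color == "red",               # raw product attribute
            "color_match": query_color == product_color,    # query x product interaction
            "label": int(query_color == product_color),
        })

# Difference in average relevance between feature = True and feature = False.
raw_gap = mean_label(pairs, "is_red", True) - mean_label(pairs, "is_red", False)
match_gap = mean_label(pairs, "color_match", True) - mean_label(pairs, "color_match", False)
```

On this data the raw attribute splits the labels evenly (a gap of 0.0), while the interaction feature separates them completely (a gap of 1.0). Real data is far noisier, but the direction of the effect is the same.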


Metarank feature types

Metarank categorizes features into:

  • item features — static product attributes (from item metadata events)
  • user features — aggregated behavioral signals per user
  • interaction features — query × item or user × item combinations
  • global features — popularity/CTR across all users

Metarank’s user_session_weight aggregator builds user affinity profiles in real time from click events, enabling user_brand_affinity-style features without offline batch pipelines.


Related topics

  • Learning to Rank — the model that consumes these features
  • LambdaMART — dominant algorithm; GBDTs handle non-linear feature interactions automatically
  • Feature Store — infrastructure for persisting and serving features at query time
  • Personalization — user × product features and affinity modeling
  • Click Signals — source of behavioral features (CTR, purchase rate)
  • Implicit Judgments — click data as training labels
  • Metarank — open-source LTR re-ranker with built-in feature types

Articles