WANDS Dataset

Overview

WANDS (Wayfair ANnotation Dataset) is a search relevance annotation dataset released by Wayfair. It provides human-annotated query–product relevance judgments for the home goods / furniture domain, designed to support search evaluation and ranking research.

Dataset Structure

  • ~233,000 annotated query–product pairs (233,448 judgments) covering 480 queries and 42,994 candidate products
  • Queries drawn from real Wayfair search traffic
  • Products: furniture, home décor, and home goods
  • Relevance labels: 3-class (Exact, Partial, Irrelevant)
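
As a rough illustration of working with judgments of this shape, the sketch below parses a tab-separated label file into a per-query lookup. The file layout and the column names (query_id, product_id, label) are assumptions made for the example, and the inline sample rows are invented, so check both against the actual release before relying on them.

```python
import csv
import io
from collections import Counter

# Tiny inline stand-in for a label file. The tab-separated layout and the
# column names are assumptions for illustration; the rows are made up.
SAMPLE_LABELS = (
    "query_id\tproduct_id\tlabel\n"
    "0\t101\tExact\n"
    "0\t102\tPartial\n"
    "0\t103\tIrrelevant\n"
    "1\t101\tIrrelevant\n"
    "1\t104\tExact\n"
)


def load_judgments(handle):
    """Return {query_id: {product_id: label}} from a tab-separated file object."""
    judgments = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        judgments.setdefault(row["query_id"], {})[row["product_id"]] = row["label"]
    return judgments


judgments = load_judgments(io.StringIO(SAMPLE_LABELS))

# Label distribution across all judged pairs, useful as a first sanity check.
label_counts = Counter(
    label for per_query in judgments.values() for label in per_query.values()
)
```

Swapping the io.StringIO object for an opened file handle should be enough to read a real file, provided its columns match the assumed names.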

Label Schema

Label       Meaning
Exact       Product directly satisfies the query intent
Partial     Product is related but doesn’t fully satisfy intent
Irrelevant  Product is not relevant to the query
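
To use these labels in a ranking metric, each class has to be mapped to a numeric gain. The sketch below computes NDCG for a single query from the labels of a ranked result list. The gain values (Exact = 1.0, Partial = 0.5, Irrelevant = 0.0) are an assumption chosen for illustration, not a weighting prescribed by the dataset.

```python
import math

# Assumed gain per label; adjust to match the evaluation protocol you follow.
GAIN = {"Exact": 1.0, "Partial": 0.5, "Irrelevant": 0.0}


def dcg(labels):
    """Discounted cumulative gain of labels listed in rank order (best rank first)."""
    return sum(GAIN[label] / math.log2(rank + 2) for rank, label in enumerate(labels))


def ndcg(ranked_labels, k=None):
    """NDCG@k for one query; k=None scores the whole list.

    ranked_labels holds the judged labels of the returned products, in the
    order the system ranked them. The ideal ordering is built from the same
    set of labels, so unjudged products are not considered here.
    """
    ideal = sorted(ranked_labels, key=lambda label: GAIN[label], reverse=True)
    ideal_dcg = dcg(ideal[:k])
    if ideal_dcg == 0:
        return 0.0  # no relevant product among the judged ones
    return dcg(ranked_labels[:k]) / ideal_dcg


perfect = ndcg(["Exact", "Partial", "Irrelevant"])
swapped = ndcg(["Irrelevant", "Exact"])
```

Averaging ndcg over all queries gives a single system-level score. Because the ideal ranking is built only from the labels passed in, the result depends on which judged products are included in the list.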

Domain Characteristics

Home goods search has distinct challenges compared to general e-commerce:

  • Highly visual product categories (color, style, material matter)
  • Queries often blend attribute combinations (“mid-century modern sofa gray”)
  • Long-tail product variants (same item in 20 finishes)
  • Style and taste are subjective, so relevance judgments can be noisy

Use Cases

  • Benchmarking lexical vs. semantic retrieval in a vertical domain
  • Training and evaluating Learning to Rank models
  • Studying transfer from general-purpose datasets such as the Amazon ESCI Dataset to a vertical domain
  • Evaluating Hybrid Search systems in e-commerce contexts

Comparison with Other Datasets

Dataset                              Domain              Scale                     Label type
Amazon ESCI Dataset                  General e-commerce  Very large (~2.6M pairs)  4-class (ESCI)
WANDS                                Home goods          ~233K pairs               3-class
Home Depot Product Search Relevance  Home improvement    ~74K pairs                Continuous 1–3
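
Because the three datasets use different label types, cross-dataset experiments need a common relevance scale. The sketch below maps each scheme onto a gain in [0, 1]. The function name to_unit_gain, the dataset keys, and every mapping value (in particular how the ESCI Substitute and Complement classes are weighted) are assumptions made for illustration rather than an established convention.

```python
# Assumed class-to-gain mappings; all values here are illustrative choices.
WANDS_GAIN = {"Exact": 1.0, "Partial": 0.5, "Irrelevant": 0.0}
ESCI_GAIN = {"Exact": 1.0, "Substitute": 0.5, "Complement": 0.0, "Irrelevant": 0.0}


def to_unit_gain(dataset, label):
    """Map a dataset-specific relevance label to a gain in [0, 1]."""
    if dataset == "wands":
        return WANDS_GAIN[label]
    if dataset == "esci":
        return ESCI_GAIN[label]
    if dataset == "home_depot":
        score = float(label)
        if not 1.0 <= score <= 3.0:
            raise ValueError(f"expected a score in [1, 3], got {score}")
        # Linear rescaling of the continuous 1-3 score onto [0, 1].
        return (score - 1.0) / 2.0
    raise ValueError(f"unknown dataset: {dataset}")


examples = [
    to_unit_gain("wands", "Partial"),
    to_unit_gain("esci", "Exact"),
    to_unit_gain("home_depot", 2.0),
]
```

A linear rescaling keeps the ordering of Home Depot scores but does not make a 0.5 there mean the same thing as a Partial label in WANDS, so comparisons across datasets should be read with that caveat in mind.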

Source