Listing Embeddings in Search Ranking

Source: https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e

Authors: Mihajlo Grbovic, Haibin Cheng, Qing Zhang, Lynn Yang, Phillippe Siclait, Matt Jones

Summary

Airbnb trained 32-dimensional listing embeddings on 800M+ search click sessions to power similar-listing recommendations and real-time personalization in search ranking. The embeddings implicitly encode location, price, listing type, architecture, and aesthetic qualities.

Training Approach

The model adapts word2vec's skip-gram with negative sampling to listing click sessions, treating each session like a sentence and each listing ID like a word:

  • Session = uninterrupted sequence of clicked listing IDs (30+ minute gap = new session)
  • 4.5M active listings
  • 800M+ click sessions
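  • Key modification: when a session ends in a booking, the booked listing is treated as global context for every clicked listing in that session, not only for clicks inside the sliding window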
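  • Market-specific negative sampling: negatives are drawn from the same market as the central listing, because clicked context listings are almost always in one market while randomly sampled negatives mostly are not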
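
The session rule and the two sampling changes above can be sketched in a few lines of Python. This is a minimal illustration, not Airbnb's pipeline: the function names, the `market_of` and `listings_by_market` lookups, and the window and negative-sample counts are all assumptions made for the example, and only same-market negatives are shown.

```python
import random
from typing import Dict, Iterator, List, Optional, Tuple

SESSION_GAP_SECONDS = 30 * 60  # a gap of 30+ minutes starts a new session


def split_sessions(clicks: List[Tuple[int, str]]) -> List[List[str]]:
    """Split one user's (timestamp_seconds, listing_id) clicks into sessions."""
    sessions: List[List[str]] = []
    last_ts: Optional[int] = None
    for ts, listing_id in sorted(clicks):
        if last_ts is None or ts - last_ts >= SESSION_GAP_SECONDS:
            sessions.append([])  # first click, or gap too long: open a new session
        sessions[-1].append(listing_id)
        last_ts = ts
    return sessions


def training_examples(
    session: List[str],
    booked: Optional[str],
    market_of: Dict[str, str],                 # hypothetical listing -> market lookup
    listings_by_market: Dict[str, List[str]],  # hypothetical market -> listings lookup
    window: int = 2,                           # assumed value, not from the source
    n_neg: int = 2,                            # assumed value, not from the source
    rng: Optional[random.Random] = None,
) -> Iterator[Tuple[str, str, int]]:
    """Yield (center, other, label) triples; label 1 = positive, 0 = negative."""
    rng = rng or random.Random(0)
    for i, center in enumerate(session):
        # Positives: clicked listings inside the sliding window around the center.
        lo, hi = max(0, i - window), min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, session[j], 1
        # Global context: the booked listing is a positive for every click in the
        # session, whether or not it falls inside the window.
        if booked is not None and booked != center:
            yield center, booked, 1
        # Market-specific negatives: sampled from the center listing's own market.
        pool = [l for l in listings_by_market[market_of[center]] if l != center]
        for neg in rng.sample(pool, min(n_neg, len(pool))):
            yield center, neg, 0
```

The triples would then feed a standard skip-gram loss with negative sampling. Sampled negatives may occasionally coincide with a context listing, which a sketch like this tolerates just as plain word2vec does.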

Model Architecture

  • 32-dimensional dense vectors per listing
  • Captures “location, price, listing type, architecture” implicitly from user behavior
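
Because every listing is a dense vector, similar listings can be found by ranking neighbors by cosine similarity. The sketch below uses made-up 3-dimensional vectors in place of the 32-dimensional ones, and the helper names and listing IDs are hypothetical; a production system would more likely use a vectorized or approximate nearest-neighbor index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(
    query_id: str, embeddings: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k listings closest to query_id, best match first."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors (illustrative only; real embeddings would have 32 dimensions).
toy_embeddings = {
    "beach_house": [0.9, 0.1, 0.0],
    "beach_condo": [0.8, 0.2, 0.1],
    "city_loft": [0.0, 0.1, 0.9],
}
neighbors = most_similar("beach_house", toy_embeddings, k=2)
```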

Integration with Search Ranking

Two real-time similarity features fed into the main ML ranking model:

  • EmbClickSim: cosine similarity between the candidate listing and the listings the user recently clicked
  • EmbSkipSim: cosine similarity between the candidate listing and the listings the user recently skipped (shown above a clicked listing but not clicked)
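
A rough sketch of how these two features might be computed at request time follows. The summary does not say how several clicked or skipped listings are combined, so averaging their embeddings into a centroid is an assumption here, as are the function names and the choice to return 0.0 for an empty history.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def history_similarity(
    candidate_id: str, history_ids: List[str], embeddings: Dict[str, Sequence[float]]
) -> float:
    """Similarity between a candidate and the centroid of a listing history.

    Assumption: histories are aggregated by averaging; listings without an
    embedding are ignored, and an empty history yields 0.0.
    """
    vectors = [embeddings[i] for i in history_ids if i in embeddings]
    if not vectors or candidate_id not in embeddings:
        return 0.0
    return cosine_similarity(embeddings[candidate_id], centroid(vectors))


def embedding_features(
    candidate_id: str,
    clicked_ids: List[str],
    skipped_ids: List[str],
    embeddings: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Build the two similarity features passed to the ranking model."""
    return {
        "EmbClickSim": history_similarity(candidate_id, clicked_ids, embeddings),
        "EmbSkipSim": history_similarity(candidate_id, skipped_ids, embeddings),
    }
```

A candidate that resembles recent clicks gets a high EmbClickSim, and one that resembles skipped listings gets a high EmbSkipSim. The ranking model, not this code, decides how much weight each signal deserves.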

Results

  • Similar Listings carousel: +21% CTR, and 4.9% more guests discovered the listing they ended up booking through the carousel
  • Real-time personalization in search ranking: launched successfully (2017)