Semantic IDs

A semantic ID is a short sequence of discrete codes — derived from an item’s or document’s content embedding — that replaces the arbitrary atomic identifier (e.g. User 5489, doc_001923) traditionally used in retrieval and recommendation. Because the codes are learned from content, semantically similar items get similar ID prefixes, so the identifier itself carries meaning.

The codes are produced by quantizing a content embedding with an RQ-VAE (residual-quantized variational autoencoder), which yields a hierarchical, coarse-to-fine token sequence.

Inception     →  [12, 153, 87, 21]
Interstellar  →  [12, 153, 87, 99]   ← shared prefix = shared semantics (sci-fi / Nolan)
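
As a minimal sketch of how the shared prefix in the example above could be measured: the helper name `shared_prefix_len` is hypothetical, the first two IDs are the illustrative ones from the example, and the third is made up here to stand for a dissimilar item. None of these values come from a trained model.

```python
from typing import Sequence


def shared_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Count how many leading codes two semantic IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


inception = [12, 153, 87, 21]
interstellar = [12, 153, 87, 99]
unrelated = [40, 7, 201, 5]  # made-up ID for a dissimilar item

print(shared_prefix_len(inception, interstellar))  # 3
print(shared_prefix_len(inception, unrelated))     # 0
```

A longer shared prefix means the two items stayed in the same cluster for more quantization levels, so prefix length acts as a rough similarity signal.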

Why Semantic IDs Matter

Atomic, randomly-assigned IDs are opaque: ID 5489 tells a model nothing about ID 5490. This causes well-known failures:

Cold start — a brand-new item has a never-before-seen ID and no learned embedding; a semantic ID is computable from its content on day one.
Long-tail / sparsity — rare items share code prefixes with popular neighbours, so signal transfers to the tail.
Vocabulary explosion — a corpus of N items needs an N-way softmax over atomic IDs; a semantic ID factorizes this into a few small codebook predictions (e.g. 3 × 256 instead of 1 × 33k).
Generalization — content-derived codes transfer across datasets and time in a way memorized atomic embeddings do not.
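
The arithmetic behind the vocabulary point can be worked through in a few lines. This is only an illustration: the corpus size of 33,000 is the "33k" figure above taken as a round number, and the variable names are chosen here.

```python
num_items = 33_000     # "33k" from the text, treated as a round number
levels = 3             # codes per semantic ID
codebook_size = 256    # entries per codebook

atomic_logits = num_items                  # one N-way softmax over atomic IDs
semantic_logits = levels * codebook_size   # three 256-way predictions
addressable_ids = codebook_size ** levels  # distinct code tuples available

print(atomic_logits)    # 33000
print(semantic_logits)  # 768
print(addressable_ids)  # 16777216
```

The output layer scales with the codebook size and depth rather than with the corpus, while the number of addressable code tuples still comfortably exceeds the number of items.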

How They Are Built

Encode item content (title, brand, category, text, image) into a dense embedding — e.g. a sentence-transformer 768-d vector.
Quantize the embedding with RQ-VAE: each residual quantization level emits one codebook index.
Concatenate the per-level indices → the semantic ID (e.g. 3 levels × 256-entry codebooks → a 3-token ID).

The result is a tokenization of the corpus: every item becomes a short string over a small, learned vocabulary.
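
The quantization step can be sketched with plain NumPy. This is a simplified illustration, not a full RQ-VAE: the codebooks are random stand-ins for learned ones, the encoder and decoder networks are omitted, and the function name `residual_quantize` and the random "embedding" are assumptions made for the example.

```python
import numpy as np


def residual_quantize(embedding: np.ndarray, codebooks: list[np.ndarray]) -> list[int]:
    """Return one codebook index per level by greedily quantizing the residual."""
    residual = embedding.astype(np.float64)
    codes = []
    for codebook in codebooks:  # each codebook has shape (codebook_size, dim)
        distances = np.linalg.norm(codebook - residual, axis=1)
        index = int(np.argmin(distances))  # nearest codeword at this level
        codes.append(index)
        residual = residual - codebook[index]  # pass what is left to the next level
    return codes


rng = np.random.default_rng(0)
dim, levels, codebook_size = 768, 3, 256
# Assumption: random codebooks stand in for the ones an RQ-VAE would learn.
codebooks = [rng.normal(size=(codebook_size, dim)) for _ in range(levels)]
embedding = rng.normal(size=dim)  # stand-in for a sentence-encoder output

semantic_id = residual_quantize(embedding, codebooks)
print(semantic_id)  # three integers, each in the range 0..255
```

Each level quantizes only what the previous levels failed to explain, which is why the first code is the coarsest and later codes refine it.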

Use in Retrieval — the Generative Angle

Semantic IDs are the identifier scheme that makes Generative Retrieval practical. Instead of dense-retrieve-then-rank, a sequence model generates the target ID token-by-token:

Recommendation: a seq2seq Transformer reads a user’s session (a sequence of item semantic IDs) and generates the next item’s semantic ID. This is the TIGER formulation (Rajput et al., 2023, arXiv:2305.05065).
Document retrieval: the same idea appeared earlier in information retrieval as the Differentiable Search Index (DSI), where a model maps a query directly to a document’s (semantically structured) identifier.
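
To feed a session to a sequence model, the per-item semantic IDs have to be flattened into one token stream. The sketch below assumes each level gets its own slice of the vocabulary (token = level × codebook size + code), so the same code index at different levels maps to different tokens. That offset convention, the function names, and the session values are assumptions for illustration and are not specified in the text above.

```python
CODEBOOK_SIZE = 256  # assumed number of entries per codebook


def to_tokens(semantic_id: list[int]) -> list[int]:
    """Map per-level codes to distinct token ids by offsetting each level."""
    return [level * CODEBOOK_SIZE + code for level, code in enumerate(semantic_id)]


def flatten_session(session: list[list[int]]) -> list[int]:
    """Concatenate the tokens of every item in a session, in order."""
    return [token for item in session for token in to_tokens(item)]


session = [[12, 153, 87], [12, 153, 90], [40, 7, 201]]  # made-up 3-level IDs
print(flatten_session(session))
# [12, 409, 599, 12, 409, 602, 40, 263, 713]
```

The model then reads this stream as input and is trained to emit the tokens of the next item's semantic ID.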

Because IDs share prefixes, beam search over codebook tokens naturally explores a neighbourhood of related items — generalization is built into the decoding.
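
The decoding behaviour can be illustrated with a toy beam search. Everything here is a sketch under stated assumptions: `toy_log_prob` is a hypothetical stand-in for a trained decoder (it simply prefers the codes of one target ID), the two unnamed catalog entries are made up, and restricting expansion to prefixes that exist in the catalog is a design choice added for the example rather than something the text prescribes.

```python
from collections import defaultdict

catalog = {
    (12, 153, 87, 21): "Inception",
    (12, 153, 87, 99): "Interstellar",
    (12, 40, 5, 63): "hypothetical item A",
    (77, 3, 18, 2): "hypothetical item B",
}

# Map each valid prefix to the set of codes that may follow it.
next_codes = defaultdict(set)
for semantic_id in catalog:
    for i, code in enumerate(semantic_id):
        next_codes[semantic_id[:i]].add(code)


def toy_log_prob(prefix: tuple, code: int) -> float:
    """Hypothetical scorer: 0.0 for the target's code at this position, else -1.0."""
    target = (12, 153, 87, 21)
    return 0.0 if code == target[len(prefix)] else -1.0


def beam_search(beam_width: int, length: int = 4) -> list[tuple]:
    """Keep the best `beam_width` valid prefixes at every decoding step."""
    beams = [((), 0.0)]
    for _ in range(length):
        candidates = [
            (prefix + (code,), score + toy_log_prob(prefix, code))
            for prefix, score in beams
            for code in sorted(next_codes[prefix])
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return [prefix for prefix, _ in beams]


for semantic_id in beam_search(beam_width=3):
    print(semantic_id, catalog[semantic_id])
```

With this scorer the top result is the target item and the runner-up is the item that shares its first three codes, which is the neighbourhood effect described above.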

Semantic IDs have also been applied to the ranking stage at YouTube (arXiv:2306.08121).

Connection to Quantization

Semantic IDs reuse the machinery of Vector Quantization — but with a different goal. Classic VQ/PQ compress vectors to save memory for ANN search; semantic IDs use RQ-VAE to produce generatable, hierarchical tokens that a model can predict. Same tools (codebooks, residual quantization), different objective.

Related Concepts

RQ-VAE — the residual-quantization autoencoder that produces the codes
Generative Retrieval — generating identifiers instead of scoring candidates
Differentiable Search Index — the IR-native generative-retrieval precursor
TIGER — generative recommendation over semantic IDs
Vector Quantization — shared codebook/quantization machinery, compression-oriented
Embeddings — the content representation that semantic IDs are derived from
Dense Embeddings — the encoder output (e.g. T5) fed into RQ-VAE

Articles

Semantic IDs for Recommendation Systems — Janu Verma; hands-on explainer and Amazon Beauty experiment

People

Janu Verma — explainer and reference implementation

