Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval

OpenAI’s text-embedding-3 models can return shortened embeddings via the dimensions API parameter. This is made possible by Matryoshka Representation Learning (MRL), a training technique that embeds information at multiple granularity levels, coarse to fine, so that a leading slice of the full vector is itself a usable embedding.

How shortening works

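To shorten an embedding, drop dimensions from the end of the vector and then re-normalize what remains. The normalization step is critical: a truncated unit vector no longer has magnitude 1, which breaks similarity measures that assume unit length (such as inner product).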
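The truncate-then-normalize step can be sketched in a few lines of plain Python. The function name shorten and its error handling are illustrative choices, not part of any OpenAI or pgvector API.

```python
import math

def shorten(embedding, dims):
    """Keep the first `dims` values of `embedding` and rescale to unit length.

    Hypothetical helper for illustration only.
    """
    if not 0 < dims <= len(embedding):
        raise ValueError("dims must be between 1 and the embedding length")
    head = embedding[:dims]                      # truncate from the end
    norm = math.sqrt(sum(x * x for x in head))   # magnitude after truncation
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]              # restore magnitude = 1

# Usage: a 3d unit vector shortened to 2d is rescaled back to unit length.
full = [3 / 13, 4 / 13, 12 / 13]
short = shorten(full, 2)
```

Without the final division, the 2d slice above would have magnitude 5/13 rather than 1, so inner-product scores would no longer be comparable across vectors.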

Adaptive Retrieval

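Adaptive Retrieval is a two-pass search that exploits the sub-vector hierarchy:

First pass (fast, low-dimensional): approximate nearest neighbor (ANN) search at 512d, returning a shortlist of match_count × 8 candidates
Second pass (accurate, high-dimensional): exact k-nearest-neighbor (KNN) re-rank of the shortlist at the full 3072d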
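The two passes can be illustrated with a small in-memory sketch. It assumes unit-length embeddings held in a Python list, and it uses a brute-force scan as a stand-in for the ANN index in the first pass. The names adaptive_retrieval, first_pass_dims and shortlist_factor are hypothetical.

```python
import math

def shorten(vec, dims):
    """Truncate to `dims` values and re-normalize (zero vectors are left as-is)."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def adaptive_retrieval(query, docs, match_count,
                       first_pass_dims=512, shortlist_factor=8):
    """Return indices of the best `match_count` docs by inner product."""
    # First pass: rank on the shortened vectors.
    # (Brute force here; a real system would use an ANN index.)
    q_short = shorten(query, first_pass_dims)
    by_short = sorted(
        range(len(docs)),
        key=lambda i: dot(q_short, shorten(docs[i], first_pass_dims)),
        reverse=True,
    )
    shortlist = by_short[:match_count * shortlist_factor]

    # Second pass: exact re-rank of the shortlist at full dimensionality.
    reranked = sorted(shortlist, key=lambda i: dot(query, docs[i]), reverse=True)
    return reranked[:match_count]

# Usage with toy 4d vectors and a 2d first pass.
docs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.6, 0.0, 0.8, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0, 0.0],
]
query = [1.0, 0.0, 0.0, 0.0]
wide = adaptive_retrieval(query, docs, 2, first_pass_dims=2, shortlist_factor=2)
narrow = adaptive_retrieval(query, docs, 2, first_pass_dims=2, shortlist_factor=1)
```

In this toy data the 2d view overrates the second document, so the narrow shortlist misses the true runner-up while the wider one recovers it. That is the role the × 8 multiplier plays in the first pass.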

Results (1M DBpedia vectors)

Method	Accuracy	Queries per second (QPS)
Single-pass ANN 1536d	89.2%	670
Adaptive Retrieval 512d→3072d	99%	580

pgvector implementation

-- Functional HNSW index on the 512d sub-vectors, using inner-product distance
CREATE INDEX ON documents
USING hnsw ((sub_vector(embedding, 512)::vector(512)) vector_ip_ops)
WITH (m = 32, ef_construction = 400);

-- The two-pass query shortlists candidates at 512d via this index, then re-ranks them at 3072d
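One way the two-pass query might be issued from Python is sketched below. Several things here are assumptions rather than details from the source: the id column, the function names, a DB-API cursor that accepts %(name)s placeholders (as psycopg does), and a sub_vector function that truncates and re-normalizes as in the index definition. The first ORDER BY repeats the indexed expression exactly so the planner can use the HNSW index.

```python
# Assumed schema: documents(id, embedding vector(3072)); `id` is hypothetical.
ADAPTIVE_QUERY = """
WITH shortlist AS (
    -- First pass: ANN search over the indexed 512d sub-vectors
    SELECT id, embedding
    FROM documents
    ORDER BY sub_vector(embedding, 512)::vector(512)
             <#> sub_vector(%(query)s::vector(3072), 512)::vector(512)
    LIMIT %(shortlist_size)s
)
-- Second pass: exact re-rank of the shortlist at the full 3072d
SELECT id
FROM shortlist
ORDER BY embedding <#> %(query)s::vector(3072)
LIMIT %(match_count)s
"""

def to_vector_literal(values):
    """Format a list of floats as pgvector's text form, e.g. '[0.5,0.25]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

def adaptive_search(cursor, query_embedding, match_count, shortlist_factor=8):
    """Run the two-pass query on a DB-API cursor and return the matching ids."""
    params = {
        "query": to_vector_literal(query_embedding),
        "shortlist_size": match_count * shortlist_factor,
        "match_count": match_count,
    }
    cursor.execute(ADAPTIVE_QUERY, params)
    return [row[0] for row in cursor.fetchall()]
```

The <#> operator is pgvector's negative inner product, so ascending order puts the most similar rows first, matching the vector_ip_ops operator class used by the index.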

Key insights

text-embedding-3-large at 256d outperforms text-embedding-ada-002 at 1536d (MTEB 62.0 vs 61.0)
The optimal first-pass size was 512d, likely because it is one of the granularities the model was trained at
Extending the approach to N passes is called Funnel Retrieval in the MRL paper
Shorter first-pass vectors aren’t always faster: their lower accuracy means more records must be retrieved in the first pass
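The N-pass idea can be sketched as a loop that raises the dimensionality while shrinking the candidate set at each stage. This is only an illustration of the general shape, not the MRL paper's exact procedure. The schedule format and the function name funnel_retrieval are assumptions, and brute-force scoring again stands in for any index.

```python
import math

def shorten(vec, dims):
    """Truncate to `dims` values and re-normalize (zero vectors are left as-is)."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def funnel_retrieval(query, docs, match_count, schedule):
    """Re-rank through successive passes.

    `schedule` is a list of (dims, keep) pairs ordered from low to high
    dimensionality; each pass keeps the best `keep` candidates.
    """
    candidates = list(range(len(docs)))
    for dims, keep in schedule:
        q = shorten(query, dims)
        candidates.sort(key=lambda i: dot(q, shorten(docs[i], dims)), reverse=True)
        candidates = candidates[:keep]
    return candidates[:match_count]

# Usage with toy 4d vectors: three passes at 2d, 3d and 4d.
docs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.6, 0.0, 0.8, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0, 0.0],
]
query = [1.0, 0.0, 0.0, 0.0]
top = funnel_retrieval(query, docs, 2, [(2, 3), (3, 2), (4, 2)])
```

With a two-entry schedule this reduces to the two-pass Adaptive Retrieval described earlier. The last insight applies to every stage: a smaller dims value is cheaper per comparison but needs a larger keep to avoid dropping true matches.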
