Dense Vector Retrieval

Definition

Dense vector retrieval represents queries and documents as dense numerical embeddings (produced by neural models), then finds the most similar documents via nearest neighbor search, usually approximate (ANN). Unlike sparse retrieval (keyword matching), it captures semantic relationships, so a query can match a document that shares no exact terms with it.

How It Works

Documents → Encoder → dense vectors → ANN Index ([[HNSW]]/[[IVF]]/...)

Query → Encoder → query vector → ANN search → top-k similar docs
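
The two flows above can be sketched in a few lines. This is a minimal illustration, not a production setup: `encode` is a hypothetical stand-in (a hashed bag of words) for a real neural encoder, so it does not capture semantics, and `top_k` does an exact scan where a real system would query an ANN index. `DIM` and the sample documents are likewise assumptions.

```python
import math
import zlib

DIM = 64  # toy embedding size (assumption)

def encode(text):
    """Hypothetical stand-in for a neural encoder: hashed bag of words, unit length."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def top_k(query_vec, doc_vecs, k):
    """Exact search: score every document by inner product, keep the k best."""
    scored = [(sum(q * d for q, d in zip(query_vec, vec)), doc_id)
              for doc_id, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]

docs = [
    "hnsw builds a layered graph index",
    "ivf partitions vectors into clusters",
    "lsh hashes vectors into buckets",
]
# Documents -> Encoder -> dense vectors (kept in a plain list instead of an ANN index)
doc_vecs = [encode(d) for d in docs]
# Query -> Encoder -> query vector -> search -> top-k (score, doc_id) pairs
hits = top_k(encode("hnsw builds a layered graph index"), doc_vecs, k=2)
```

Because the vectors are unit length, the inner product equals cosine similarity, which is why the scan can rank by a plain dot product.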

Index Types (FAISS / ANN)

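| Index | Speed | Recall | Memory | Best For |
|---|---|---|---|---|
| Flat (brute force) | Slowest | 100% | ~500MB/1M | Small datasets |
| HNSW | Fastest | 95%+ | 600-1600MB | Quality-focused |
| IVF | Fast | 70-95% | ~520MB | Balanced, scalable |
| LSH | Variable | 40-85% | 20-600MB | Low-dimensional |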
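
To make the speed/recall trade-off of IVF concrete, here is a rough pure-Python sketch of the idea, not FAISS itself. It assumes centroids are simply sampled from the data (real implementations train them with k-means), and the names `build_ivf`, `search_ivf`, `nlist`, and `nprobe` are chosen for illustration.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, nlist, seed=0):
    """Sample nlist vectors as centroids (no k-means refinement) and bucket every vector."""
    rng = random.Random(seed)
    centroids = [vectors[i] for i in rng.sample(range(len(vectors)), nlist)]
    lists = [[] for _ in range(nlist)]
    for vec_id, vec in enumerate(vectors):
        nearest = min(range(nlist), key=lambda c: sq_dist(vec, centroids[c]))
        lists[nearest].append(vec_id)
    return centroids, lists

def search_ivf(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vec_id for c in order[:nprobe] for vec_id in lists[c]]
    candidates.sort(key=lambda i: (sq_dist(query, vectors[i]), i))
    return candidates[:k]

rng = random.Random(42)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(500)]
query = [rng.gauss(0, 1) for _ in range(8)]
centroids, lists = build_ivf(vectors, nlist=16)

# Probing every list is equivalent to a flat (brute-force) scan.
exact = search_ivf(query, vectors, centroids, lists, k=10, nprobe=16)
# Probing 2 of 16 lists scans far fewer vectors but can miss true neighbors.
approx = search_ivf(query, vectors, centroids, lists, k=10, nprobe=2)
recall = len(set(exact) & set(approx)) / len(exact)
```

Raising `nprobe` moves the index toward flat-search recall at the cost of scanning more vectors, which is the range the IVF row expresses.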

HNSW (Hierarchical Navigable Small World) is the most widely used:

  • Graph-based multi-layer structure
  • Key params: M (max connections per node), efSearch (candidate list size at query time), efConstruction (candidate list size at build time)
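
The role of the three parameters can be illustrated with a deliberately simplified sketch: a single-layer navigable small-world graph, not full HNSW (no layer hierarchy, no neighbor-selection heuristic). The function names and the fixed entry point at node 0 are assumptions made for this example.

```python
import heapq
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search_graph(query, vectors, neighbors, entry, ef):
    """Best-first search that keeps the ef closest nodes seen so far."""
    start = sq_dist(query, vectors[entry])
    visited = {entry}
    frontier = [(start, entry)]   # min-heap: closest unexplored node first
    best = [(-start, entry)]      # max-heap via negation: worst kept result on top
    while frontier:
        d, node = heapq.heappop(frontier)
        if d > -best[0][0]:
            break                 # closest unexplored node is worse than every kept result
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = sq_dist(query, vectors[nb])
            if len(best) < ef or d_nb < -best[0][0]:
                heapq.heappush(frontier, (d_nb, nb))
                heapq.heappush(best, (-d_nb, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-neg_d, node) for neg_d, node in best)

def build_graph(vectors, M, ef_construction):
    """Insert nodes one by one, linking each to the M nearest nodes the search finds."""
    neighbors = {0: []}
    for node in range(1, len(vectors)):
        found = search_graph(vectors[node], vectors, neighbors, 0, ef_construction)
        links = [nb for _, nb in found[:M]]
        neighbors[node] = links
        for nb in links:
            neighbors[nb].append(node)
            if len(neighbors[nb]) > M:  # prune back to the M closest links
                neighbors[nb].sort(key=lambda j: sq_dist(vectors[nb], vectors[j]))
                del neighbors[nb][M:]
    return neighbors

rng = random.Random(0)
vectors = [[rng.random() for _ in range(4)] for _ in range(200)]
graph = build_graph(vectors, M=8, ef_construction=32)
query = [0.5, 0.5, 0.5, 0.5]
results = search_graph(query, vectors, graph, entry=0, ef=16)[:5]  # (distance, id) pairs
```

In this sketch a larger `M` gives a denser graph (more memory, better connectivity), while larger `ef` values widen the search beam, trading time for recall at build time (`ef_construction`) or query time (`ef`, playing the role of efSearch).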

Key Models Producing Dense Vectors

The Filtering Problem

Standard ANN indexes don’t support metadata filters efficiently:

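  • Pre-filter + brute-force: Apply the metadata filter first, then scan the matching vectors exhaustively. Accurate but slow when many documents match
  • Post-filter: Run the ANN search first, then drop hits that fail the filter. Fast but may return fewer than k results
  • Single-stage (Pinecone): Merges the metadata and vector indexes so the filter is applied during the search, aiming for the accuracy of pre-filtering at the speed of post-filtering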
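
The difference between the first two strategies shows up even on toy data. The sketch below assumes an exact `nearest` helper standing in for the ANN index, plus hypothetical one-dimensional vectors and a `lang` metadata field; single-stage filtering is not shown because it depends on the index internals.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(query, vectors, ids, k):
    """Exact k-nearest search restricted to the given ids (stand-in for an ANN index)."""
    return sorted(ids, key=lambda i: (sq_dist(query, vectors[i]), i))[:k]

def pre_filter(query, vectors, metadata, predicate, k):
    """Filter first, then scan the survivors: returns k hits whenever k matches exist."""
    allowed = [i for i, meta in enumerate(metadata) if predicate(meta)]
    return nearest(query, vectors, allowed, k)

def post_filter(query, vectors, metadata, predicate, k):
    """Search first, then drop non-matching hits: may return fewer than k."""
    hits = nearest(query, vectors, range(len(vectors)), k)
    return [i for i in hits if predicate(metadata[i])]

def is_german(meta):
    return meta["lang"] == "de"

vectors = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1]]
metadata = [{"lang": "en"}, {"lang": "en"}, {"lang": "de"},
            {"lang": "de"}, {"lang": "de"}, {"lang": "de"}]

pre = pre_filter([0.0], vectors, metadata, is_german, k=3)    # -> [2, 3, 4]
post = post_filter([0.0], vectors, metadata, is_german, k=3)  # -> [2]
```

Post-filtering returns a single hit here because two of the three nearest vectors fail the filter. A common workaround is to over-fetch (search for more than k and then filter), which narrows the gap but does not guarantee k results.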

See: Vector Filtering

Symmetric vs. Asymmetric Retrieval

  • Symmetric — query and document are similar length/type (e.g., duplicate question detection)
  • Asymmetric — short query retrieves long documents (e.g., question → Wikipedia passage)

See: Asymmetric Semantic Search

Articles