Nearest Neighbor Indexes for Similarity Search

Source: https://www.pinecone.io/learn/series/faiss/vector-indexes/
Author: James Briggs (Pinecone)

Summary

James Briggs explains the major ANN (Approximate Nearest Neighbor) index types available in FAISS, their tradeoffs, and guidance on choosing the right index for different scale and quality requirements.

The ANN Problem

Exact nearest neighbor search over millions of vectors is O(n × d) — too slow for real-time queries. ANN algorithms trade a small accuracy loss for massive speedups.

Index Types

import faiss
index = faiss.IndexFlatL2(dimension)  # exact L2 search
index.add(vectors)
distances, indices = index.search(query, k=10)
  • Accuracy: 100% (exact)
  • Speed: Slowest — O(n)
  • Use when: Small dataset (<100K), or as quality baseline

IVF (Inverted File Index)

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist=100)  # 100 clusters
index.train(vectors)
index.add(vectors)
index.nprobe = 10  # search 10 clusters at query time
  • Accuracy: ~95–99% (tunable via nprobe)
  • Speed: Fast — searches only nprobe clusters
  • Use when: Medium dataset (100K–10M), need GPU support

HNSW (Hierarchical Navigable Small World)

index = faiss.IndexHNSWFlat(d, M=16)  # M = connections per node
index.hnsw.efConstruction = 40
index.add(vectors)
index.hnsw.efSearch = 16
  • Accuracy: 95–99% (tunable via efSearch)
  • Speed: Very fast — graph traversal
  • Memory: Higher than IVF (stores graph structure)
  • Use when: Low latency requirement, CPU-only, no batch training needed

IVF + PQ (Product Quantization)

index = faiss.IndexIVFPQ(quantizer, d, nlist=100, m=8, nbits=8)

Compresses vectors from float32 to quantized codes:

  • 4–8x memory reduction
  • Small accuracy loss
  • Use when: Very large dataset (>100M), memory-constrained

LSH (Locality Sensitive Hashing)

Hashes similar vectors to the same buckets:

  • Very fast but lower accuracy
  • Largely superseded by HNSW in practice

Index Selection Guide

Dataset SizeQuality PriorityRecommended Index
<100KExactFlat
100K–1MHighHNSW
1M–100MBalancedIVF-Flat
>100MMemory constrainedIVF-PQ
GPU availableSpeedIVF (GPU)

Pinecone’s Managed Index

Pinecone handles index selection automatically and adds:

  • Native Vector Filtering (metadata predicates alongside ANN)
  • Automatic scaling
  • Real-time upsert/delete

Quality Metrics

Recall@k: fraction of true top-k results returned by ANN.
HNSW with efSearch=64 typically achieves Recall@10 > 0.99 at good throughput.

People