pgvector

Open-source PostgreSQL extension adding a vector column type and approximate-nearest-neighbor indexing for vector search inside Postgres.

Capabilities

  • Index types:
    • HNSW: recommended default; high recall, low latency. Built incrementally as rows are inserted, so it needs no upfront clustering step and can be created on an empty table.
    • IVFFlat: smaller index, faster build, lower recall. Needs a one-time k-means clustering pass at CREATE INDEX to compute the centroids of its cells (the lists parameter sets how many) from the rows already present, so load data first and REINDEX if the distribution shifts. (“Training” here = unsupervised clustering, not model training.)
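  • Distance ops: cosine (<=>), L2 (<->), inner product (<#>, which returns the negative inner product so that smaller still means closer). The index operator class must match the operator used in ORDER BY (e.g. vector_cosine_ops for <=>).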
  • Types: vector, halfvec (half-precision), sparsevec (sparse), bit (binary vectors; see Binary Quantization).
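
CREATE EXTENSION IF NOT EXISTS vector;  -- enable pgvector once per database
ALTER TABLE products ADD COLUMN embedding vector(1536);
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops);
-- :q is the query embedding, bound as a vector(1536) parameter
SELECT * FROM products ORDER BY embedding <=> :q LIMIT 20;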
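
For comparison, the IVFFlat lifecycle described above might look like the sketch below. The index name products_embedding_ivfflat and the values lists = 100 and probes = 10 are illustrative placeholders, not tuned recommendations; the statement assumes the products table is already populated.

```sql
-- Build only after the table holds representative rows, because the
-- k-means pass computes the cell centroids from the data present now.
CREATE INDEX products_embedding_ivfflat
    ON products USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);  -- number of cells; placeholder value

-- Query-time knob: how many cells to scan per query (default 1).
-- Higher values raise recall at the cost of latency.
SET ivfflat.probes = 10;

-- Recompute the centroids once the data distribution has shifted.
REINDEX INDEX products_embedding_ivfflat;
```

HNSW needs no equivalent rebuild step for new data, which is one reason it is the simpler default.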

Role

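The semantic-retrieval leg of a PostgreSQL Hybrid Search stack: pgvector supplies the vector ranking, native full-text search (FTS) or ParadeDB BM25 supplies the keyword ranking, and the two result lists are fused via Reciprocal Rank Fusion (RRF).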
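
A minimal sketch of such a fusion with native FTS, under stated assumptions: RRF scores each row as the sum of 1 / (k + rank) over the rankings it appears in, with k = 60 as the conventional constant. The tsvector column body_tsv, the :terms parameter, the id column, and the candidate limit of 50 are hypothetical and not part of the schema shown earlier.

```sql
WITH semantic AS (
    -- Top candidates by cosine distance, ranked 1..50
    SELECT id, row_number() OVER (ORDER BY embedding <=> :q) AS rnk
    FROM products
    ORDER BY embedding <=> :q
    LIMIT 50
),
keyword AS (
    -- Top candidates by full-text relevance, ranked 1..50
    SELECT id,
           row_number() OVER (ORDER BY ts_rank_cd(body_tsv, query) DESC) AS rnk
    FROM products, plainto_tsquery('english', :terms) AS query
    WHERE body_tsv @@ query
    ORDER BY ts_rank_cd(body_tsv, query) DESC
    LIMIT 50
)
SELECT COALESCE(s.id, k.id) AS id,
       -- A row missing from one ranking contributes 0 for that ranking
       COALESCE(1.0 / (60 + s.rnk), 0.0)
     + COALESCE(1.0 / (60 + k.rnk), 0.0) AS rrf_score
FROM semantic s
FULL OUTER JOIN keyword k ON s.id = k.id
ORDER BY rrf_score DESC
LIMIT 20;
```

Because RRF uses only ranks, the cosine distances and ts_rank_cd scores never need to be normalized against each other.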