The Missing WHERE Clause in Vector Search

Source: https://www.pinecone.io/learn/vector-search-filtering/
Author: James Briggs (Pinecone)

Summary

James Briggs explains why metadata filtering in vector search is hard, why naive pre-filter and post-filter approaches fail at the extremes, and how Pinecone’s single-stage approach solves the problem.

The SQL Analogy

SQL has a natural WHERE clause:

SELECT * FROM products 
WHERE category = 'shoes' AND price < 100
ORDER BY relevance DESC
LIMIT 10;

Vector search should have the equivalent:

search(query_vector, filter={"category": "shoes", "price__lt": 100}, top_k=10)

But this is technically non-trivial because ANN indexes don’t natively support arbitrary predicates.

Why Pre-filtering Fails

Pre-filter: apply metadata filter → run ANN on filtered subset.

Problem at high selectivity (e.g., filter matches 0.1% of corpus):

  • ANN indexes need a minimum corpus size to work well
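  • With only 0.1% of docs, the “index” is tiny (e.g., a few thousand vectors out of a few million)
  • The search effectively falls back to exact (brute-force) search over that subset, giving up the index’s speed advantage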
  • Index structures built for the full corpus don’t apply
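
As a rough illustration of the pre-filter pattern, the sketch below filters a toy corpus by metadata first and then scores every survivor exhaustively. The corpus layout, the `pre_filter_search` and `cheap_shoes` names, and the cosine scoring are assumptions made for this example, not Pinecone's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pre_filter_search(corpus, query, predicate, k):
    """Filter by metadata first, then rank the survivors by exact similarity."""
    survivors = [doc for doc in corpus if predicate(doc["metadata"])]
    # No ANN index exists for this ad hoc subset, so every survivor is scored
    ranked = sorted(survivors, key=lambda d: cosine(d["values"], query), reverse=True)
    return ranked[:k], len(survivors)

# Hypothetical toy corpus
corpus = [
    {"id": "a", "values": [1.0, 0.0], "metadata": {"category": "shoes", "price": 50}},
    {"id": "b", "values": [0.9, 0.1], "metadata": {"category": "hats", "price": 20}},
    {"id": "c", "values": [0.0, 1.0], "metadata": {"category": "shoes", "price": 80}},
    {"id": "d", "values": [0.7, 0.7], "metadata": {"category": "shoes", "price": 150}},
]

def cheap_shoes(meta):
    return meta["category"] == "shoes" and meta["price"] < 100

hits, scanned = pre_filter_search(corpus, [1.0, 0.0], cheap_shoes, k=2)
print([h["id"] for h in hits], "scored", scanned, "vectors")
```

The second return value is the number of vectors scored, which is the cost that grows with the size of the filtered subset.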

Why Post-filtering Fails

Post-filter: run ANN on full corpus → apply metadata filter to top-K.

Problem at high selectivity:

Query: find shoes under $10 (0.1% of catalog)
ANN returns top-1000 by vector similarity
Post-filter: only 1 of 1000 matches "price < $10"
Effective recall: catastrophic

Need to over-retrieve 100–1000x to compensate → kills latency.
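
The over-retrieval factor can be estimated with simple arithmetic. The sketch below assumes, as a simplification, that documents passing the filter are spread uniformly through the similarity ranking; `expected_matches` and `required_fetch` are hypothetical helper names.

```python
import math

def expected_matches(fetched, selectivity):
    """Expected number of fetched results that pass the filter."""
    return fetched * selectivity

def required_fetch(k, selectivity):
    """Results to fetch so that about k survive the filter on average."""
    return math.ceil(k / selectivity)

k = 10
for selectivity in (0.01, 0.001):
    fetch = required_fetch(k, selectivity)
    print(f"selectivity={selectivity}: fetch {fetch} results ({fetch // k}x over-retrieval)")

print("matches expected in a top-1000 at 0.1%:", expected_matches(1000, 0.001))
```

Under this assumption, a filter matching 1% of the corpus needs roughly 100x over-retrieval and a 0.1% filter roughly 1000x, which is where the 100–1000x range comes from.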

Pinecone’s Single-Stage Solution

The key insight: during HNSW graph traversal, apply the metadata filter inline:

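# Conceptually: the filter is a pruning function applied during graph traversal.
# traverse_hnsw_graph is a placeholder assumed to yield nodes in roughly
# increasing distance from the query; it is not a real library call.
def hnsw_filtered_search(query, filter_fn, k):
    candidates = []
    for node in traverse_hnsw_graph(query):
        if filter_fn(node.metadata):  # apply filter during traversal
            candidates.append(node)
            if len(candidates) >= k:
                break  # stop once k matching nodes are collected
    return candidates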

By integrating the filter into the traversal, the system:

  1. Never retrieves docs that fail the filter
  2. Continues traversal until k matching docs are found
  3. Maintains recall even at high selectivity

Metadata Storage Architecture

# Pinecone upsert with metadata
index.upsert([{
    "id": "product_123",
    "values": embedding_vector,
    "metadata": {
        "category": "shoes",
        "price": 89.99,
        "brand": "Nike",
        "in_stock": True
    }
}])
 
# Filtered query
results = index.query(
    vector=query_embedding,
    filter={"category": "shoes", "price": {"$lt": 100}},
    top_k=10
)
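
To make the filter semantics concrete, here is a minimal sketch of how a filter dictionary of this shape could be evaluated against one record's metadata. It is an illustration only, not Pinecone's implementation: `matches` and `OPS` are hypothetical names, and only equality shorthand plus a handful of comparison operators are handled.

```python
import operator

# Comparison operators supported by this sketch (a subset, chosen for illustration)
OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}

def matches(metadata, flt):
    """Return True if `metadata` satisfies every condition in `flt` (implicit AND)."""
    for field, condition in flt.items():
        if field not in metadata:
            return False
        value = metadata[field]
        if isinstance(condition, dict):
            # {"price": {"$lt": 100}} style: every listed operator must hold
            if not all(OPS[op](value, operand) for op, operand in condition.items()):
                return False
        elif value != condition:
            # {"category": "shoes"} style: shorthand for equality
            return False
    return True

product = {"category": "shoes", "price": 89.99, "brand": "Nike", "in_stock": True}
print(matches(product, {"category": "shoes", "price": {"$lt": 100}}))
print(matches(product, {"category": "shoes", "price": {"$lt": 50}}))
```

A predicate like this is what a filtered search would call on each candidate's metadata while deciding whether to keep it.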

Performance Characteristics

Filter Selectivity   Pre-filter   Post-filter   Single-stage (Pinecone)
50%                  Good         Good          Good
10%                  Degraded     Good          Good
1%                   Poor         Degraded      Good
0.1%                 Broken       Broken        Good

Single-stage maintains consistent quality across all selectivity levels.
