BlockMax WAND: How Weaviate Achieved 10x Faster Keyword Search

Authors: André Mourão, Joon-Pil (JP) Hwang (Weaviate)
Published: 2025-02-26

Weaviate implemented Block-Max WAND as a technical preview in v1.29, achieving p50 query speedups of 5–10x and index size reductions of 50–90% over its existing WAND-based keyword search.

Problem

As text corpora grow, keyword search latency increases relative to vector search. Even with WAND already reducing documents inspected from 100% to 15–30%, there was significant headroom for further optimization. Keyword search is a core leg of Hybrid Search for RAG and agentic AI.

Key Concepts

Inverted Index and Tokenization

Documents are tokenized (whitespace, lowercase, etc.) and stored in posting lists mapping term → (doc_id, tf). Exhaustive scoring requires visiting every document with at least one query term — wasteful for top-k queries that only need 10–100 results.
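As a minimal sketch of the structure described above, the following builds posting lists of (doc_id, tf) pairs from a toy corpus. The helper names (`tokenize`, `build_inverted_index`, `candidates`) and the sample documents are hypothetical, and the tokenizer is only the simplest whitespace-plus-lowercase strategy, not Weaviate's implementation.

```python
from collections import Counter, defaultdict

def tokenize(text):
    """Whitespace tokenization plus lowercasing (one simple strategy)."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to a posting list of (doc_id, tf), sorted by doc_id."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term, tf in Counter(tokenize(docs[doc_id])).items():
            index[term].append((doc_id, tf))
    return dict(index)

def candidates(index, query):
    """Doc IDs with at least one query term: what exhaustive scoring visits."""
    return sorted({doc_id
                   for term in tokenize(query)
                   for doc_id, _ in index.get(term, [])})

# Hypothetical toy corpus.
docs = {
    0: "Keyword search with BM25",
    1: "Vector search and keyword search",
    2: "Hybrid search for RAG",
}
index = build_inverted_index(docs)
print(index["search"])                       # posting list for "search"
print(candidates(index, "hybrid keyword"))   # docs an exhaustive scorer must visit
```

Even in this tiny corpus a two-term query touches every document, which is the cost that top-k pruning methods try to avoid.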

WAND (existing approach)

WAND uses a global, IDF-based upper bound per term to skip documents that cannot beat the current k-th best score. On standard benchmarks this reduces the share of documents inspected from 100% to 15–30%.
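The pruning idea can be sketched as a small document-at-a-time loop. This is an illustrative simplification, not Weaviate's code: the function name `wand_top_k` is hypothetical, the scoring formula `idf * tf / (tf + 1)` is a toy stand-in for BM25 chosen so that each term's contribution is bounded by its IDF, and the sketch assumes `k >= 1`.

```python
import heapq

def wand_top_k(postings, idf, k):
    """Return (top_k, num_scored); top_k holds (score, doc_id), best first."""
    cursors = {t: 0 for t in postings}
    heap = []          # min-heap of the k best (score, doc_id) seen so far
    num_scored = 0
    while True:
        # Terms that still have postings, ordered by their current doc ID.
        active = sorted((postings[t][cursors[t]][0], t)
                        for t in postings if cursors[t] < len(postings[t]))
        threshold = heap[0][0] if len(heap) == k else 0.0
        # Pivot: first doc ID at which the summed upper bounds beat the threshold.
        bound, pivot = 0.0, None
        for doc_id, t in active:
            bound += idf[t]
            if bound > threshold:
                pivot = doc_id
                break
        if pivot is None:
            break      # nothing left can enter the top k
        if active[0][0] == pivot:
            # Every list up to the pivot sits on the pivot doc: score it fully.
            score = 0.0
            for doc_id, t in active:
                if doc_id != pivot:
                    break
                tf = postings[t][cursors[t]][1]
                score += idf[t] * tf / (tf + 1.0)   # toy impact, always < idf[t]
                cursors[t] += 1
            num_scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, pivot))
            elif score > threshold:
                heapq.heapreplace(heap, (score, pivot))
        else:
            # Lagging lists cannot produce a winner before the pivot: skip ahead.
            # (A linear walk here; real indexes use skip structures.)
            for doc_id, t in active:
                if doc_id >= pivot:
                    break
                while (cursors[t] < len(postings[t])
                       and postings[t][cursors[t]][0] < pivot):
                    cursors[t] += 1
    return sorted(heap, reverse=True), num_scored

# Hypothetical data: a common low-IDF term and a rare high-IDF term.
postings = {
    "the": [(d, 1) for d in range(20)],
    "wand": [(2, 3), (5, 2), (17, 1)],
}
idf = {"the": 0.1, "wand": 2.0}
top, num_scored = wand_top_k(postings, idf, k=2)
print(top, num_scored)
```

On this toy input only 7 of the 20 candidate documents are fully scored: once two strong matches fill the heap, documents containing only the low-IDF term can no longer beat the threshold and are skipped.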

Block-Max WAND

Block-Max WAND (BMW) further refines this by dividing posting lists into fixed blocks (128 docs/block in Weaviate), each with local max impact metadata:

Shallow advances: skip entire blocks without decoding doc IDs, using block-level max impact
Deep block skipping: avoid loading blocks from disk entirely when they cannot beat the threshold
Together, these reduce documents inspected from 15–30% (WAND) to 5–15%, cutting WAND's work by more than half on the benchmarks below.

Dataset	WAND % scored	BMW % scored	Reduction
MS Marco (8.6M docs)	15.1%	6.7%	-56%
Fever (5.4M docs)	20.8%	8.4%	-60%
Climate Fever (5.4M docs)	29.3%	12.2%	-58%
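The block-level metadata can be illustrated with a single posting list. The sketch below is an assumption-laden simplification: the `Block` class, `build_blocks`, and `scan` are hypothetical names, impacts are precomputed floats, and the threshold is fixed, whereas a real Block-Max WAND raises it as the top-k heap fills and combines bounds across several terms.

```python
from dataclasses import dataclass

BLOCK_SIZE = 128  # docs per block, the value the text gives for Weaviate

@dataclass
class Block:
    last_doc_id: int    # lets a cursor jump past the block without decoding it
    max_impact: float   # upper bound on any score contribution in the block
    postings: list      # (doc_id, impact) pairs; compressed on disk in practice

def build_blocks(postings, block_size=BLOCK_SIZE):
    """Split a sorted posting list into fixed-size blocks with max-impact metadata."""
    blocks = []
    for start in range(0, len(postings), block_size):
        chunk = postings[start:start + block_size]
        blocks.append(Block(chunk[-1][0], max(imp for _, imp in chunk), chunk))
    return blocks

def scan(blocks, threshold, other_terms_bound=0.0):
    """Return (hits, decoded_blocks), touching only blocks that could beat threshold."""
    hits, decoded = [], 0
    for block in blocks:
        if block.max_impact + other_terms_bound <= threshold:
            continue    # whole block skipped using metadata alone
        decoded += 1
        hits.extend((d, imp) for d, imp in block.postings
                    if imp + other_terms_bound > threshold)
    return hits, decoded

# Hypothetical list: 1000 low-impact postings and one strong match at doc 700.
postings = [(d, 0.9 if d == 700 else 0.2) for d in range(1000)]
blocks = build_blocks(postings)
hits, decoded = scan(blocks, threshold=0.5)
print(len(blocks), decoded, hits)
```

Here a global bound of 0.9 would force WAND to consider every posting, while the per-block bound lets the scan decode a single block out of eight.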

Compression

Block storage also enables aggressive compression:

varenc (variable-length encoding): store only the minimum bits needed to represent the max value in each block.
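The bit-width idea can be sketched with generic per-block bit packing. This is not Weaviate's on-disk format, which the text does not specify; `pack_block` and `unpack_block` are hypothetical helpers and the sample term frequencies are made up.

```python
def pack_block(values):
    """Pack non-negative ints using the fewest bits that fit the block's max value."""
    width = max(max(values).bit_length(), 1)
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * width)          # value i occupies bits [i*width, (i+1)*width)
    n_bytes = (len(values) * width + 7) // 8
    return width, len(values), packed.to_bytes(n_bytes, "little")

def unpack_block(width, count, data):
    """Inverse of pack_block."""
    packed = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]

# Hypothetical term frequencies for one small block.
tfs = [1, 3, 2, 7, 1, 1, 4, 2]
width, count, data = pack_block(tfs)
print(width, len(data))                     # bits per value, bytes for the block
```

Because the largest value is 7, each entry needs 3 bits, so the eight values fit in 3 bytes instead of the 32 bytes that fixed 32-bit integers would take.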

Delta encoding (doc IDs): store differences between consecutive doc IDs. Consecutive IDs in a sorted posting list are close together, producing small deltas that compress well.
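A short sketch of delta encoding on a sorted list of doc IDs follows. The function names and sample IDs are hypothetical; the point is only that the gaps need far fewer bits than the raw IDs, which is what makes the minimum-bit-width packing effective.

```python
from itertools import accumulate

def delta_encode(doc_ids):
    """Keep the first ID as is, then store the gap to each previous ID."""
    return doc_ids[:1] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]

def delta_decode(deltas):
    """Running sum restores the original sorted IDs."""
    return list(accumulate(deltas))

# Hypothetical sorted doc IDs from one block.
doc_ids = [1000000, 1000003, 1000004, 1000010]
deltas = delta_encode(doc_ids)
print(deltas)
print(max(doc_ids).bit_length(), max(deltas[1:]).bit_length())  # bits: raw vs gaps
```

In this example the raw IDs need 20 bits each, while the gaps after the first entry need at most 3.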

Combined result: 50–90% reduction in index size depending on data distribution.

Dataset	WAND index	BMW index	Reduction
MS Marco (8.6M docs)	10531 MB	941 MB	-91%
Fever/Climate Fever (5.4M docs)	9326 MB	1175 MB	-87%

Query Time Results

Dataset	WAND p50	BMW p50	Reduction
MS Marco (8.6M docs)	136 ms	27 ms	-80%
Fever (5.4M docs)	517 ms	33 ms	-94%
Climate Fever (5.4M docs)	712 ms	87 ms	-88%
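To relate the reduction percentages in the table to speedup factors, the arithmetic can be checked directly from the p50 values above. The `summarize` helper is hypothetical; the numbers are the ones in the table.

```python
# p50 latencies in ms, taken from the table: (WAND, BMW)
p50_ms = {
    "MS Marco": (136, 27),
    "Fever": (517, 33),
    "Climate Fever": (712, 87),
}

def summarize(wand_ms, bmw_ms):
    """Return (latency reduction in %, speedup factor)."""
    reduction_pct = round(100 * (wand_ms - bmw_ms) / wand_ms)
    speedup = round(wand_ms / bmw_ms, 1)
    return reduction_pct, speedup

for name, (wand_ms, bmw_ms) in p50_ms.items():
    reduction_pct, speedup = summarize(wand_ms, bmw_ms)
    print(f"{name}: -{reduction_pct}% p50, {speedup}x faster")
```

A reduction of 80% corresponds to a 5x speedup, and the factor grows quickly as the reduction approaches 100%.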

At 100M documents, throughput scaled from 1 to 50 QPS while p50 latency stayed at 100–200 ms and p99 latency at or below 1000 ms.

Status and Availability

Available in Weaviate v1.29 as a technical preview. Requires:

New collections (no transparent migration for existing data yet)
Specific env vars enabled

Not recommended for production environments. Weaviate is developing a transparent migration path.
