BBQ — Better Binary Quantization

Elasticsearch’s production approach to aggressive vector compression, combining binary quantization with techniques that recover precision. Part of Elasticsearch’s broader Vector Quantization story.

The Core Idea

Naive binary quantization keeps only the sign of each dimension (1 bit per dimension), which is fast but lossy. BBQ adds:

- Centroid centering: subtract the mean vector from every stored vector before quantizing, which reduces distortion when the data is not centered on the origin
- Rescoring: candidates found with the binary codes are reranked with full-precision dot products, recovering most of the lost recall

Net result: roughly 32× less vector memory (32-bit floats reduced to 1 bit per dimension), with recall competitive with float32 once candidates are rescored.
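The two additions above can be sketched in a few lines of plain Python. This is a minimal illustration of sign-bit quantization with centroid centering and a full-precision rescoring pass, not Elastic's implementation; the function names (`centroid`, `to_bits`, `search`) and the use of Hamming distance for candidate selection are assumptions made for the example.

```python
def centroid(vectors):
    """Mean vector of the dataset, used to center vectors before quantizing."""
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

def to_bits(vector, center):
    """Pack the sign of each centered dimension into one int (1 bit per dimension)."""
    bits = 0
    for i, (x, c) in enumerate(zip(vector, center)):
        if x - c > 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(query, vectors, codes, center, k=10, num_candidates=100):
    """Two-stage search: cheap binary candidate selection, then exact rescoring."""
    q_code = to_bits(query, center)
    # Stage 1: rank every stored code by Hamming distance to the query code.
    by_code = sorted(range(len(codes)), key=lambda i: hamming(q_code, codes[i]))
    candidates = by_code[:num_candidates]
    # Stage 2: rerank only the candidates with full-precision dot products.
    return sorted(candidates, key=lambda i: dot(query, vectors[i]), reverse=True)[:k]
```

Codes are built once at index time with `codes = [to_bits(v, center) for v in vectors]`. Only the packed codes need to be scanned for every query; the float vectors are read just for the `num_candidates` survivors.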

OSQ — Optimized Scalar Quantization

Elastic’s companion technique: it maps float32 dimensions onto a uniform integer grid, so distance computations can use integer SIMD arithmetic.

- Speed: 10–40× faster than float32 HNSW traversal on CPU hardware
- Uniform grid: the per-dimension work in every comparison becomes integer arithmetic (no float division)
- Anisotropic loss: when choosing quantization boundaries, error is weighted more heavily where it affects the dot product most, which favors high-magnitude dimensions
- CPU-friendly: uses AVX2/AVX-512 SIMD instructions, with no GPU required
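To make the uniform-grid point concrete, the sketch below quantizes vectors onto an evenly spaced grid and estimates a dot product so that the only per-dimension work is an integer multiply-add. It is an illustration under assumptions: the interval `[lo, hi]` is passed in by hand (OSQ chooses it by optimizing a loss), and the names `quantize`, `dequantize`, and `estimated_dot` are hypothetical.

```python
def quantize(vector, lo, hi, bits=8):
    """Snap each value to the nearest point of a uniform grid on [lo, hi]."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    # Clamp to the interval, then round to the nearest grid index.
    codes = [round((min(max(x, lo), hi) - lo) / step) for x in vector]
    # code_sum is precomputed once per vector so queries never recompute it.
    return {"codes": codes, "lo": lo, "step": step, "code_sum": sum(codes)}

def dequantize(q):
    """Reconstruct the approximate float vector from grid indices."""
    return [q["lo"] + q["step"] * c for c in q["codes"]]

def estimated_dot(a, b):
    """Dot product of two dequantized vectors without dequantizing them.

    Expanding sum((lo_a + s_a*qa_i) * (lo_b + s_b*qb_i)) leaves one
    integer dot product plus a few per-vector scalar corrections.
    """
    int_dot = sum(x * y for x, y in zip(a["codes"], b["codes"]))  # integer-only loop
    d = len(a["codes"])
    return (d * a["lo"] * b["lo"]
            + a["lo"] * b["step"] * b["code_sum"]
            + b["lo"] * a["step"] * a["code_sum"]
            + a["step"] * b["step"] * int_dot)
```

The integer loop in `estimated_dot` is the part that maps onto SIMD integer instructions; the scalar corrections are applied once per vector pair.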

OSQ vs TurboQuant

TurboQuant (Google Research, Zandieh et al. 2026) uses a rotation followed by a Lloyd-Max codebook. OSQ uses a uniform integer grid, which is simpler and SIMD-native.

- Elastic’s benchmark (Thomas Veasey): OSQ is competitive with TurboQuant at practical recall targets, and OSQ wins on CPU throughput
- Qdrant’s benchmark (10 datasets): TurboQuant 4-bit (8× compression) beats scalar quantization (SQ) on 3 of 10 datasets by up to 4.6 percentage points; TurboQuant 2-bit beats binary quantization (BQ) by 9–24 percentage points across all datasets
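The compression figures quoted for these bit widths follow from simple arithmetic on bits per dimension. The helper below is a small sketch of that calculation; it assumes float32 source vectors and ignores any per-vector correction values or codebooks a real index stores, and the function names are hypothetical.

```python
def compression_ratio(bits_per_dim, source_bits=32):
    """How many times smaller a quantized dimension is than the source dimension."""
    return source_bits / bits_per_dim

def vector_bytes(dims, bits_per_dim):
    """Raw storage for one vector, excluding per-vector metadata."""
    return dims * bits_per_dim / 8
```

For example, `compression_ratio(4)` gives the 8× figure for 4-bit codes, and `compression_ratio(1)` gives the 32× figure for 1-bit codes.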

Elasticsearch BBQ in Practice

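- Field type: dense_vector with element_type: float (the default); BBQ quantizes float vectors at index time, while element_type: byte or bit is for vectors that are already quantized
- bbq_hnsw: index_options type for Elasticsearch's binary quantized HNSW index
- bbq_flat: index_options type for a flat (brute-force) scan with binary quantization
- Rescoring: set num_candidates much larger than k (num_candidates >> k), then rescore the candidates exactly with the original vectors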
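As a rough illustration of how these settings fit together, the sketch below builds an index mapping and a kNN search body as plain Python dictionaries that could be serialized to JSON. The field name `embedding`, the dimension count, and the helper names are assumptions for the example; explicit rescoring options vary by Elasticsearch version and are left out, so check the documentation for the version in use.

```python
def bbq_mapping(field="embedding", dims=384, index_type="bbq_hnsw"):
    """Index mapping for a float dense_vector field quantized with BBQ."""
    if index_type not in ("bbq_hnsw", "bbq_flat"):
        raise ValueError("index_type must be 'bbq_hnsw' or 'bbq_flat'")
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    # Selects binary quantization for this field.
                    "index_options": {"type": index_type},
                }
            }
        }
    }

def knn_search_body(query_vector, field="embedding", k=10, num_candidates=200):
    """kNN search body that gathers far more candidates than the k returned."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": field,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        }
    }
```

Keeping `num_candidates` well above `k` gives the quantized index a larger pool to draw from, which is what makes the later exact pass effective.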

Related Concepts

Vector Quantization — parent concept; BBQ is one strategy
Binary Quantization — the underlying 1-bit quantization BBQ extends
Scalar Quantization — OSQ is Elastic’s int8 companion technique
Dense Vector Retrieval — where BBQ is applied
HNSW — the index BBQ accelerates
Approximate Nearest Neighbor Search
GGUF — quantization for LLM weights (different domain)

Articles

Elasticsearch BBQ Optimized Scalar Quantization vs TurboQuant — Thomas Veasey; Elastic’s CPU benchmarks comparing OSQ and TurboQuant
TurboQuant in Qdrant — Ivan Pleshkov & Jonas Schulz; Qdrant 1.18 implementation with anisotropy compensation and RaBitQ extensions

People

Thomas Veasey — Elastic principal engineer; BBQ and OSQ designer
