Improving Information Retrieval in the Elastic Stack: Introducing ELSER

Source: https://www.elastic.co/search-labs/blog/elastic-learned-sparse-encoder-elser-retrieval-performance
Publisher: Elastic

Summary

Elastic announces ELSER (Elastic Learned Sparse Encoder), its production-grade sparse retrieval model built on SPLADE principles. The post reports a 17% NDCG@10 improvement over BM25 on the MS MARCO dev set and describes the distillation-based training approach.

What is ELSER?

ELSER is Elastic’s version of learned sparse retrieval, optimized for the Elasticsearch inverted index infrastructure:

Based on SPLADE architecture
100M parameter BERT-based model
Produces vocabulary-space sparse vectors (30,522 BERT vocabulary dimensions)
Deployed natively in Elasticsearch
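
To make "vocabulary-space sparse vector" concrete, the following is a minimal sketch of how two such vectors could be scored against each other. It assumes each vector is a mapping from vocabulary token to weight and that relevance is the dot product over shared tokens, as in SPLADE-style models. The token weights are invented for illustration and are not real ELSER output; `sparse_dot` is a hypothetical helper name.

```python
from typing import Dict

# A sparse vector keeps only the non-zero vocabulary entries: token -> weight.
SparseVector = Dict[str, float]


def sparse_dot(query: SparseVector, doc: SparseVector) -> float:
    """Score a document as the dot product over tokens present in both vectors."""
    # Iterate over the smaller vector so the cost depends on the shorter side.
    if len(query) > len(doc):
        query, doc = doc, query
    return sum(weight * doc[token] for token, weight in query.items() if token in doc)


# Hypothetical expansion weights, invented for illustration only.
query_vec = {"machine": 1.5, "learning": 1.25, "ai": 0.5, "algorithm": 0.25}
doc_vec = {"neural": 1.0, "ai": 2.0, "algorithm": 1.0, "training": 0.75}

score = sparse_dot(query_vec, doc_vec)
print(score)
```

In this toy example the document shares no literal query word ("machine", "learning") yet still receives a non-zero score through the expanded tokens "ai" and "algorithm", which is the behavior that separates learned sparse retrieval from plain lexical matching.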

Training Methodology: Knowledge Distillation

ELSER was trained by distilling knowledge from an ensemble of two teacher models:

Teacher: MiniLM (fast bi-encoder) + MonoT5-3B (large cross-encoder reranker)
    ↓
Generate high-quality labels for MS MARCO + domain data
    ↓
Student: 100M parameter BERT MLM model
    ↓
ELSER: compressed, fast sparse encoder

The teacher ensemble captures complementary signals:

MiniLM: semantic similarity
MonoT5-3B: nuanced relevance judgment

This is an example of Retrieval Pipeline distillation: a multi-stage pipeline (first-stage retrieval followed by reranking) is compressed into a single model.
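
The idea of training a student against ensemble-generated labels can be sketched as follows. The source does not say how the two teachers' scores are combined or which loss is used, so everything here is an assumption made for illustration: scores are min-max normalized and averaged, and the student is compared to the teacher with a KL divergence over the candidate passages of one query. All function names are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1] so teachers with different ranges are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_teacher_scores(bi_encoder: List[float], cross_encoder: List[float]) -> List[float]:
    """Assumed combination rule: average of the two normalized score lists."""
    return [(a + b) / 2 for a, b in zip(min_max(bi_encoder), min_max(cross_encoder))]


def kl_distillation_loss(teacher: List[float], student: List[float]) -> float:
    """KL(teacher || student) over the candidate passages of a single query."""
    p = softmax(teacher)
    q = softmax(student)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Made-up scores for three candidate passages of one query.
teacher = ensemble_teacher_scores([0.82, 0.40, 0.10], [4.1, -1.3, -3.0])
loss_agreeing = kl_distillation_loss(teacher, [3.0, 1.0, 0.0])     # same ranking as teacher
loss_disagreeing = kl_distillation_loss(teacher, [0.0, 1.0, 3.0])  # reversed ranking
print(loss_agreeing < loss_disagreeing)
```

A student that ranks the passages like the teacher ensemble gets a lower loss than one that reverses the ranking, which is the training signal distillation relies on.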

Performance Results

On the MS MARCO dev set (NDCG@10):

Model	NDCG@10	Notes
BM25	0.228	Lexical baseline
ELSER v1	0.267	+17% over BM25
Dense bi-encoder	0.253	Semantic only
Hybrid BM25+Dense	0.271	Best combination

ELSER outperforms both BM25 and standard dense bi-encoders individually, approaching hybrid quality with a single model.
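
As a check on the arithmetic, the sketch below recomputes the relative gain from the table values and shows one common way NDCG@k is computed. It assumes the linear-gain variant of DCG and that the relevance list passed in contains every judged document for the query; the evaluation code behind the reported numbers may differ. Function names are hypothetical.

```python
import math
from typing import List


def dcg_at_k(relevances: List[float], k: int = 10) -> float:
    """Discounted cumulative gain: each gain is discounted by log2(rank + 1), ranks from 1."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: List[float], k: int = 10) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering of the same list."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def relative_gain(baseline: float, candidate: float) -> float:
    """Relative improvement of candidate over baseline, as a fraction."""
    return (candidate - baseline) / baseline


# Values taken from the table above.
bm25, elser = 0.228, 0.267
gain = relative_gain(bm25, elser)
print(f"ELSER v1 over BM25: {gain:.1%}")
```

The gain works out to (0.267 - 0.228) / 0.228, or roughly 17%, which matches the figure in the table.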

ELSER v2 Improvements

ELSER v2 brings:

Better OOD (out-of-distribution) performance
Faster inference
Lower memory footprint
Improved handling of numeric/code content

Integration in Elasticsearch

# Elasticsearch 8.x API
GET /my_index/_search
{
  "query": {
    "text_expansion": {
      "ml.tokens": {
        "model_id": ".elser_model_2_linux-x86_64",
        "model_text": "find documents about machine learning"
      }
    }
  }
}

Elasticsearch stores ELSER sparse vectors natively in its inverted index, so no separate vector database is required.
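
For applications that build the request in code, the same body can be assembled as a plain dictionary. The sketch below only constructs and prints the JSON shown above; it does not contact a cluster, and sending it would need an HTTP or Elasticsearch client that is not shown here. The helper name `text_expansion_body` and its default arguments are assumptions chosen to mirror the example request.

```python
import json
from typing import Any, Dict


def text_expansion_body(
    model_text: str,
    model_id: str = ".elser_model_2_linux-x86_64",
    field: str = "ml.tokens",
) -> Dict[str, Any]:
    """Build the search body for a text_expansion query against an ELSER token field."""
    return {
        "query": {
            "text_expansion": {
                field: {
                    "model_id": model_id,      # deployed ELSER model to expand the query with
                    "model_text": model_text,  # raw user query text
                }
            }
        }
    }


body = text_expansion_body("find documents about machine learning")
print(json.dumps(body, indent=2))
```

Keeping the field name and model ID as parameters makes it easier to switch between model variants, such as the platform-specific and generic v2 IDs that appear in this note.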

ELSER as Hybrid Component

For best results, Elastic recommends combining ELSER with BM25 in a hybrid query:

GET /my_index/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "content": "query" } },  // BM25
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_2",
              "model_text": "query"
            }
          }
        }  // ELSER
      ]
    }
  }
}
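
Because a bool query adds up the scores of its matching should clauses, the BM25 and ELSER scores are summed even though they sit on different scales. A minimal sketch of a builder that exposes a per-clause `boost` for rebalancing them is shown below. The helper name and the default boost values are assumptions, suitable weights would have to be tuned on your own data, and the code only builds the request body without sending it.

```python
import json
from typing import Any, Dict


def hybrid_body(
    text: str,
    text_field: str = "content",
    tokens_field: str = "ml.tokens",
    model_id: str = ".elser_model_2",
    bm25_boost: float = 1.0,
    elser_boost: float = 1.0,
) -> Dict[str, Any]:
    """Build a bool/should body combining a BM25 match clause with an ELSER clause."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Lexical clause (BM25), using the expanded form of match to carry a boost.
                    {"match": {text_field: {"query": text, "boost": bm25_boost}}},
                    # Learned sparse clause (ELSER).
                    {
                        "text_expansion": {
                            tokens_field: {
                                "model_id": model_id,
                                "model_text": text,
                                "boost": elser_boost,
                            }
                        }
                    },
                ]
            }
        }
    }


hybrid = hybrid_body("query", bm25_boost=1.0, elser_boost=2.0)
print(json.dumps(hybrid, indent=2))
```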

Related Articles

SPLADE for Sparse Vector Search Explained (SPLADE foundations)
SPLADE - Sparse Bi-Encoder BERT Model for First-Stage Ranking (original research)
Hybrid Search SPLADE Sparse Encoder (hybrid implementation)
Distilling Retrieval Pipelines to a Single Embedding Model (related distillation concept)

Related Concepts

ELSER (primary topic)
SPLADE (architecture basis)
Sparse Vector Retrieval (category)
Retrieval Pipeline (distillation training)
Hybrid Search (recommended deployment)
Embedding Fine-tuning (distillation is a form of training)
