How to Choose the Best Model for Semantic Search

Source: https://www.meilisearch.com/blog/choosing-the-best-model-for-semantic-search
Publisher: Meilisearch

Summary

A practical model selection guide from the Meilisearch team covering the dimensions that matter when choosing an embedding model: quality, speed, cost, language support, and domain fit.

Key Selection Dimensions

1. Quality (MTEB Benchmark)

The Massive Text Embedding Benchmark (MTEB) evaluates models across 56 tasks:

Retrieval (BEIR subset)
Classification
Clustering
Semantic textual similarity (STS)
Reranking
Summarization

Top performers (as of 2024):

Model	MTEB Score	Dims	Notes
text-embedding-3-large	64.6	3072	OpenAI, paid
e5-mistral-7b-instruct	66.6	4096	Large, slow
nomic-embed-text-v1.5	62.4	768	Open source, MRL
all-MiniLM-L6-v2	56.3	384	Fast, small

2. Speed

Inference speed depends on model size and dimensions:

Small models (<100M params): <10ms per query
Medium (100M–500M): 10–50ms
Large (1B+): 50ms+ without GPU batching

For real-time search, prefer small-medium models or use Matryoshka Embeddings for adaptive speed-quality tradeoff.

3. Cost

Hosted APIs (OpenAI, Cohere): per-token pricing at index time + query time
Self-hosted (sentence-transformers): compute cost only
Hybrid: hosted API for cold-start, self-hosted after migration

For high-volume production, self-hosted typically wins on cost past ~10M tokens/month.

4. Language Support

English-only: most base models
Multilingual: multilingual-e5-large, paraphrase-multilingual-MiniLM
Evaluated on MIRACL for non-English retrieval

5. Domain Fit

General-purpose models may underperform on specialized domains. Options:

Use domain-adapted model (e.g., medis-bert for medical)
Fine-tune general model on domain data (Embedding Fine-tuning)
Use instruction-tuned model with domain-specific prompt (Task-Aware Embeddings)

Decision Flowchart

Is latency critical?
  ├─ Yes → small model (MiniLM) or Matryoshka truncated
  └─ No → check quality requirements
         ├─ General knowledge → nomic-embed or text-embedding-3-small
         └─ Specialized domain → fine-tune or domain model
                    ├─ Have labeled data? → fine-tune bi-encoder
                    └─ No labeled data? → instruction-tuned (E5, Instructor)

Meilisearch Integration

Meilisearch supports multiple embedding backends:

{
  "embedders": {
    "default": {
      "source": "huggingFace",
      "model": "BAAI/bge-base-en-v1.5"
    }
  }
}

Documents are auto-embedded at index time; queries embedded at search time.

Introduction to Matryoshka Embedding Models — flexible dimension selection
Fine-Tuning Text Embeddings For Domain-Specific Search — domain adaptation
Nearest Neighbor Indexes for Similarity Search — ANN infrastructure
Symmetric vs. Asymmetric Semantic Search — task-specific model selection

Semantic Search — primary topic
Bi-Encoder — architecture being selected
Dense Vector Retrieval — infrastructure
Matryoshka Embeddings — speed-quality tradeoff
Embedding Fine-tuning — domain adaptation path
Task-Aware Embeddings — instruction-based adaptation
Asymmetric Semantic Search — common retrieval task

Awesome Search KG

Explorer

How to Choose the Best Model for Semantic Search

How to Choose the Best Model for Semantic Search

Summary

Key Selection Dimensions

1. Quality (MTEB Benchmark)

2. Speed

3. Cost

4. Language Support

5. Domain Fit

Decision Flowchart

Meilisearch Integration

Graph View

Table of Contents

Backlinks

Awesome Search KG

Explorer

How to Choose the Best Model for Semantic Search

How to Choose the Best Model for Semantic Search

Summary

Key Selection Dimensions

1. Quality (MTEB Benchmark)

2. Speed

3. Cost

4. Language Support

5. Domain Fit

Decision Flowchart

Meilisearch Integration

Related Articles

Related Concepts

Graph View

Table of Contents

Backlinks