How to Choose the Best Model for Semantic Search
Source: https://www.meilisearch.com/blog/choosing-the-best-model-for-semantic-search
Publisher: Meilisearch
Summary
A practical model selection guide from the Meilisearch team covering the dimensions that matter when choosing an embedding model: quality, speed, cost, language support, and domain fit.
Key Selection Dimensions
1. Quality (MTEB Benchmark)
The Massive Text Embedding Benchmark (MTEB) evaluates models across 56 tasks:
- Retrieval (BEIR subset)
- Classification
- Clustering
- Semantic textual similarity (STS)
- Reranking
- Summarization
Top performers (as of 2024):
| Model | MTEB Score | Dims | Notes |
|---|---|---|---|
| text-embedding-3-large | 64.6 | 3072 | OpenAI, paid |
| e5-mistral-7b-instruct | 66.6 | 4096 | Large, slow |
| nomic-embed-text-v1.5 | 62.4 | 768 | Open source, MRL |
| all-MiniLM-L6-v2 | 56.3 | 384 | Fast, small |
2. Speed
Inference speed depends on model size and dimensions:
- Small models (<100M params): <10ms per query
- Medium (100M–500M): 10–50ms
- Large (1B+): 50ms+ without GPU batching
For real-time search, prefer small-medium models or use Matryoshka Embeddings for adaptive speed-quality tradeoff.
3. Cost
- Hosted APIs (OpenAI, Cohere): per-token pricing at index time + query time
- Self-hosted (sentence-transformers): compute cost only
- Hybrid: hosted API for cold-start, self-hosted after migration
For high-volume production, self-hosted typically wins on cost past ~10M tokens/month.
4. Language Support
- English-only: most base models
- Multilingual:
multilingual-e5-large,paraphrase-multilingual-MiniLM - Evaluated on MIRACL for non-English retrieval
5. Domain Fit
General-purpose models may underperform on specialized domains. Options:
- Use domain-adapted model (e.g.,
medis-bertfor medical) - Fine-tune general model on domain data (Embedding Fine-tuning)
- Use instruction-tuned model with domain-specific prompt (Task-Aware Embeddings)
Decision Flowchart
Is latency critical?
├─ Yes → small model (MiniLM) or Matryoshka truncated
└─ No → check quality requirements
├─ General knowledge → nomic-embed or text-embedding-3-small
└─ Specialized domain → fine-tune or domain model
├─ Have labeled data? → fine-tune bi-encoder
└─ No labeled data? → instruction-tuned (E5, Instructor)
Meilisearch Integration
Meilisearch supports multiple embedding backends:
{
"embedders": {
"default": {
"source": "huggingFace",
"model": "BAAI/bge-base-en-v1.5"
}
}
}Documents are auto-embedded at index time; queries embedded at search time.
Related Articles
- Introduction to Matryoshka Embedding Models — flexible dimension selection
- Fine-Tuning Text Embeddings For Domain-Specific Search — domain adaptation
- Nearest Neighbor Indexes for Similarity Search — ANN infrastructure
- Symmetric vs. Asymmetric Semantic Search — task-specific model selection
Related Concepts
- Semantic Search — primary topic
- Bi-Encoder — architecture being selected
- Dense Vector Retrieval — infrastructure
- Matryoshka Embeddings — speed-quality tradeoff
- Embedding Fine-tuning — domain adaptation path
- Task-Aware Embeddings — instruction-based adaptation
- Asymmetric Semantic Search — common retrieval task