How to Choose the Best Model for Semantic Search

Source: https://www.meilisearch.com/blog/choosing-the-best-model-for-semantic-search
Publisher: Meilisearch

Summary

A practical model selection guide from the Meilisearch team covering the dimensions that matter when choosing an embedding model: quality, speed, cost, language support, and domain fit.

Key Selection Dimensions

1. Quality (MTEB Benchmark)

The Massive Text Embedding Benchmark (MTEB) evaluates models across 56 tasks:

  • Retrieval (BEIR subset)
  • Classification
  • Clustering
  • Semantic textual similarity (STS)
  • Reranking
  • Summarization

Top performers (as of 2024):

ModelMTEB ScoreDimsNotes
text-embedding-3-large64.63072OpenAI, paid
e5-mistral-7b-instruct66.64096Large, slow
nomic-embed-text-v1.562.4768Open source, MRL
all-MiniLM-L6-v256.3384Fast, small

2. Speed

Inference speed depends on model size and dimensions:

  • Small models (<100M params): <10ms per query
  • Medium (100M–500M): 10–50ms
  • Large (1B+): 50ms+ without GPU batching

For real-time search, prefer small-medium models or use Matryoshka Embeddings for adaptive speed-quality tradeoff.

3. Cost

  • Hosted APIs (OpenAI, Cohere): per-token pricing at index time + query time
  • Self-hosted (sentence-transformers): compute cost only
  • Hybrid: hosted API for cold-start, self-hosted after migration

For high-volume production, self-hosted typically wins on cost past ~10M tokens/month.

4. Language Support

  • English-only: most base models
  • Multilingual: multilingual-e5-large, paraphrase-multilingual-MiniLM
  • Evaluated on MIRACL for non-English retrieval

5. Domain Fit

General-purpose models may underperform on specialized domains. Options:

Decision Flowchart

Is latency critical?
  ├─ Yes → small model (MiniLM) or Matryoshka truncated
  └─ No → check quality requirements
         ├─ General knowledge → nomic-embed or text-embedding-3-small
         └─ Specialized domain → fine-tune or domain model
                    ├─ Have labeled data? → fine-tune bi-encoder
                    └─ No labeled data? → instruction-tuned (E5, Instructor)

Meilisearch Integration

Meilisearch supports multiple embedding backends:

{
  "embedders": {
    "default": {
      "source": "huggingFace",
      "model": "BAAI/bge-base-en-v1.5"
    }
  }
}

Documents are auto-embedded at index time; queries embedded at search time.