Quepid

Open-source, web-based search relevance evaluation platform. Lets teams manage judgment lists, run queries against a live search engine, and compute ranking metrics (NDCG, MRR, P@K) interactively. Built and maintained by OpenSource Connections.


What It Does

Quepid is a “Test-Driven Relevancy” dashboard — the search equivalent of a unit test runner. You define test cases (query + expected relevant results), run them against your live search engine, and see metric scores per query and in aggregate.
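The per-query and aggregate scoring described above can be sketched in a few lines. This is an illustration rather than Quepid's own code: the test-case shape (`query`, `relevant`, `returned`), the document ids, and the helper names are all hypothetical.

```javascript
// Hypothetical test cases: each pairs a query with the ids judged relevant
// and the ids the engine returned, in rank order.
const testCases = [
  { query: "laptop bag", relevant: ["d1", "d4"], returned: ["d1", "d2", "d4", "d9"] },
  { query: "usb c hub", relevant: ["d7"], returned: ["d3", "d7", "d8", "d5"] },
];

// Precision at k: fraction of the top k results that are judged relevant.
function precisionAtK(returned, relevant, k) {
  const relevantSet = new Set(relevant);
  const hits = returned.slice(0, k).filter((id) => relevantSet.has(id)).length;
  return hits / k;
}

// Reciprocal rank: 1 / position of the first relevant result, or 0 if none.
function reciprocalRank(returned, relevant) {
  const relevantSet = new Set(relevant);
  const index = returned.findIndex((id) => relevantSet.has(id));
  return index === -1 ? 0 : 1 / (index + 1);
}

// Score every query individually.
const perQuery = testCases.map((tc) => ({
  query: tc.query,
  p4: precisionAtK(tc.returned, tc.relevant, 4),
  rr: reciprocalRank(tc.returned, tc.relevant),
}));

// Average the per-query scores to get the aggregate view.
const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;
const aggregate = {
  meanP4: mean(perQuery.map((r) => r.p4)),
  mrr: mean(perQuery.map((r) => r.rr)),
};

console.log(perQuery, aggregate);
```

The per-query rows show which queries are weak; the aggregate gives a single number to track across changes.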

Key workflows:

  • Judgment management — create, import, and maintain relevance grades for query/document pairs
  • Metric scoring — compute NDCG, MRR, P@K against your search engine in real time
  • Regression detection — compare metric snapshots across index or config changes
  • Custom scorers — write JavaScript scoring functions for non-standard metrics (e.g. custom NDCG@10)
  • Team collaboration — shared cases and scores across the relevance team
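As a rough illustration of the regression-detection workflow, the sketch below compares two snapshots of per-query scores and lists the queries that got worse. The snapshot format (a plain object mapping query text to score), the sample scores, and the `findRegressions` helper are assumptions made for this example, not Quepid's export format or API.

```javascript
// Hypothetical snapshots: per-query scores captured before and after a change.
const snapshotBefore = { "laptop bag": 0.82, "usb c hub": 0.64, "hdmi cable": 0.9 };
const snapshotAfter = { "laptop bag": 0.85, "usb c hub": 0.41, "hdmi cable": 0.9 };

// Return the queries whose score dropped by more than `tolerance`.
function findRegressions(older, newer, tolerance = 0.05) {
  const regressions = [];
  for (const [query, oldScore] of Object.entries(older)) {
    if (!(query in newer)) continue; // query missing from the newer snapshot
    const delta = newer[query] - oldScore;
    if (delta < -tolerance) {
      regressions.push({ query, oldScore, newScore: newer[query], delta });
    }
  }
  // Largest drop first.
  return regressions.sort((a, b) => a.delta - b.delta);
}

const regressions = findRegressions(snapshotBefore, snapshotAfter);
console.log(regressions);
```

The tolerance keeps small score fluctuations from being reported as regressions; the right value depends on the metric and the size of the judgment list.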

Scorer Architecture

Each Quepid test case runs a scorer — a JavaScript function with access to:

  • docs — the result set returned by the search engine
  • bestDocs — the ideal result set derived from judgments
  • setScore(value) — outputs the final score for the query

This makes it straightforward to implement NDCG, DCG, or custom business metrics.
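A minimal sketch of an NDCG@10 scorer built on these three inputs follows. Inside Quepid, docs, bestDocs, and setScore would be supplied by the scorer runtime; here they are stubbed so the snippet runs on its own. The `rating` field on each document, the sample data, and the exponential-gain DCG formula are assumptions for illustration, so check the scorer documentation for the actual object shape.

```javascript
// Stand-ins for what the scorer runtime would provide. The `rating` field and
// the sample values are assumptions made for this sketch.
const docs = [
  { id: "d2", rating: 1 },
  { id: "d1", rating: 3 },
  { id: "d9", rating: 0 },
];
const bestDocs = [
  { id: "d1", rating: 3 },
  { id: "d4", rating: 2 },
  { id: "d2", rating: 1 },
];
let finalScore = null;
const setScore = (value) => { finalScore = value; };

// DCG over the top k ratings, using the exponential-gain variant:
// sum of (2^rel - 1) / log2(rank + 1), with rank starting at 1.
function dcg(ratings, k) {
  return ratings
    .slice(0, k)
    .reduce((sum, rel, i) => sum + (Math.pow(2, rel) - 1) / Math.log2(i + 2), 0);
}

// Scorer body: NDCG@k = DCG of the returned order / DCG of the ideal order.
const k = 10;
const actual = dcg(docs.map((d) => d.rating), k);
const ideal = dcg(bestDocs.map((d) => d.rating), k);
setScore(ideal === 0 ? 0 : actual / ideal);

console.log(finalScore);
```

Guarding against an ideal DCG of zero avoids a division by zero for queries that have no positive judgments yet.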

When to Use Quepid vs. Scripts

Quepid                            Pandas / scripts
--------------------------------  --------------------------------------------
Interactive exploration           Large-scale batch evaluation (>100K results)
Team collaboration                CI/CD pipeline integration
Quick per-query inspection        Combining metrics with other signals
Non-technical stakeholder review  Custom analysis across many system variants

People