Canva

Design platform used by 150M+ people. Search powers discovery of templates, photos, audio, fonts, and video across a massive creative asset catalog.

Search Context

Canva’s search challenge is breadth: four fundamentally different content types (templates, media, audio, fonts) originally had four separate search codebases. The team rebuilt all of them on a shared componentized architecture.

Scale: ~20,000 requests/second at peak. About 70% of queries are single words (“cat”, “dog”) and 20% are two words. Single-character queries are common in CJK languages.

Key Engineering Work

  • Migrated from a “big ball of mud” (4 separate Solr/custom codebases) to a shared pipeline architecture
  • New pipeline: Validation → Tokenization → Annotation → Candidate Generation → Feature Extraction → Re-ranking
  • Deadline-based candidate generation with a 50ms budget: generators run in parallel and return the top 500 candidates for early pages; deep pages use a lighter execution path
  • Technology-agnostic query DSL decoupled from Lucene syntax
  • Enabled ML integration for ranking and personalization across all 4 content types
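
The deadline-based candidate generation described above can be illustrated with a minimal Python sketch. This is not Canva's implementation: the function names (run_generator, generate_candidates), the (doc_id, score) candidate shape, and the max-score merge rule are all assumptions made for illustration. Only the 50ms budget, the parallel generators, and the top-500 cutoff come from the notes above. The lighter deep-page path is not modeled.

```python
import asyncio

DEADLINE_SECONDS = 0.050  # 50ms budget, from the notes above
TOP_K = 500               # candidates kept for early pages, from the notes above


async def run_generator(delay, candidates):
    # Hypothetical generator: sleeps to simulate index latency,
    # then returns a list of (doc_id, score) pairs.
    await asyncio.sleep(delay)
    return candidates


async def generate_candidates(generators, deadline=DEADLINE_SECONDS, top_k=TOP_K):
    # Start every generator concurrently.
    tasks = [asyncio.ensure_future(g) for g in generators]
    # Wait until all finish or the deadline passes, whichever comes first.
    done, pending = await asyncio.wait(tasks, timeout=deadline)
    # Generators that missed the deadline are cancelled and their results dropped.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    merged = {}
    for task in done:
        for doc_id, score in task.result():
            # Assumed merge rule: keep the best score per document.
            merged[doc_id] = max(score, merged.get(doc_id, float("-inf")))
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


async def demo():
    fast = run_generator(0.0, [("a", 0.9), ("b", 0.4)])
    also_fast = run_generator(0.0, [("b", 0.7), ("c", 0.1)])
    slow = run_generator(1.0, [("z", 1.0)])  # misses the 50ms deadline
    return await generate_candidates([fast, also_fast, slow])


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the slow generator is simply dropped when the deadline expires, so the response time is bounded by the deadline rather than by the slowest generator. Whether a real system would discard late results or use partial output from them is a design choice the notes do not specify.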

Articles