Canva Search Pipeline Part I

Source: https://canvatechblog.com/search-pipeline-part-i-faa6c543aef1
Author: Stuart Cam (Canva Search & Recommendations team)
Published: November 2022

Summary

The first part of a series on Canva’s search architecture overhaul. It describes the problems with the organically grown “big ball of mud” architecture and motivates the redesign. Scale: ~20,000 requests/second at peak, 50+ unique entry points.

The Problem: Big Ball of Mud

Organic growth produced four separate search systems (templates, media, audio, and fonts), each with its own codebase. The systems passed SolrQuery builder objects around and modified queries through string manipulation, which locked them into Lucene syntax and blocked integration of newer technologies such as dense vectors and ML ranking.
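
To make the coupling concrete, here is a minimal sketch in Python of what building queries by string manipulation tends to look like. It is not Canva's code: the helper add_filter and the field names are hypothetical, chosen only for illustration. The point is that every caller ends up knowing Lucene syntax, so a backend that does not speak Lucene (a dense-vector index, for example) cannot consume the result.

```python
def add_filter(query: str, field: str, value: str) -> str:
    """Hypothetical helper: append a filter by editing the Lucene query string."""
    # The caller must know Lucene's boolean and field:value syntax.
    # Values are not escaped here, which is another hazard of this style.
    return f"({query}) AND {field}:{value}"


q = "title:cat"
q = add_filter(q, "type", "template")
q = add_filter(q, "locale", "en")
print(q)  # ((title:cat) AND type:template) AND locale:en
```

Each layer wraps the previous string, so the query's meaning is only recoverable by parsing Lucene syntax again.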

Key Scale Facts

~20,000 requests/second at peak (public content search)
~50 unique entry points into search systems
70% of queries are single words (“cat”, “dog”)
20% contain two words
Users rarely engage past position 240 in results
Single-character queries common for CJK languages

Requirements for New System

ML integration for ranking and personalization
Support for both search and recommendation queries
Technology-agnostic query DSL (not tied to Solr syntax)
Enhanced debugging and observability
Unified architecture across 4 content types
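
The "technology-agnostic query DSL" requirement can be pictured as a small query tree that is only translated into engine syntax at the edge. The sketch below is an assumption about what such a DSL might look like, not the design described in the article: the Term and And types, the renderer functions, and the dict shape are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Term:
    """Match a single value in a field (hypothetical DSL node)."""
    field: str
    value: str


@dataclass(frozen=True)
class And:
    """All clauses must match (hypothetical DSL node)."""
    clauses: Tuple["Query", ...]


Query = Union[Term, And]


def to_lucene(q: Query) -> str:
    """Render the tree as a Lucene-style query string (no escaping, for brevity)."""
    if isinstance(q, Term):
        return f"{q.field}:{q.value}"
    return "(" + " AND ".join(to_lucene(c) for c in q.clauses) + ")"


def to_dict(q: Query) -> dict:
    """Render the same tree as a nested dict for a JSON-based engine (illustrative shape)."""
    if isinstance(q, Term):
        return {"term": {q.field: q.value}}
    return {"and": [to_dict(c) for c in q.clauses]}


query = And((Term("title", "cat"), Term("type", "template")))
print(to_lucene(query))  # (title:cat AND type:template)
print(to_dict(query))
```

Because callers build the tree rather than a string, adding a new backend means adding a renderer, and the callers stay unchanged.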

Team Context

~80 engineers across: ML, data science, backend, frontend, operations.

People

Stuart Cam (Canva)

Canva Search Pipeline Part II
