How Netflix Content Engineering Makes a Federated Graph Searchable
Source: Part 1 | Part 2
Author: Netflix Content Engineering team
Summary
Two-part series on how Netflix makes its federated content knowledge graph searchable. Covers the challenges of searching across a distributed graph of entities (movies, shows, people, studios) and the architecture that supports that search.
Core Problem
Netflix’s content catalog is represented as a federated graph — a distributed system where different teams own different entity types (titles, talent, studios, rights). Making this searchable requires:
- Aggregating entities across graph partitions
- Representing relationships, not just leaf documents
- Supporting both keyword and semantic queries over structured entity data
Key Architecture Concepts
Graph as Search Index
- Entities (movies, shows, people, studios) become indexed documents
- Relationships (actor → movie, director → studio) are attached to documents as features
- Graph traversals materialized as pre-computed search signals
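The three points above can be sketched in a few lines. This is a minimal illustration, not Netflix's schema or pipeline: the entity records, the relation names (`acted_in`, `produced_by`), and the `build_documents` helper are all hypothetical. Each entity becomes one flat document, its one-hop relationships become list-valued fields, and a two-hop traversal is computed once at index time and stored as a field.

```python
# Hypothetical toy graph: entity records plus typed edges (not Netflix's schema).
entities = {
    "m1": {"type": "movie", "name": "Example Movie"},
    "p1": {"type": "person", "name": "Alex Actor"},
    "s1": {"type": "studio", "name": "Example Studio"},
}
edges = [
    ("p1", "acted_in", "m1"),
    ("m1", "produced_by", "s1"),
]


def build_documents(entities, edges):
    """Flatten each entity and its relationships into one indexable document."""
    outgoing = {}
    for src, rel, dst in edges:
        outgoing.setdefault(src, []).append((rel, dst))

    docs = {}
    for entity_id, record in entities.items():
        doc = {"id": entity_id, **record}
        # One-hop relationships become document features keyed by relation name.
        for rel, dst in outgoing.get(entity_id, []):
            doc.setdefault(rel, []).append(entities[dst]["name"])
        # Materialize a two-hop traversal (e.g. person -> movie -> studio)
        # at index time so no graph walk is needed at query time.
        doc["two_hop_names"] = sorted(
            {
                entities[second]["name"]
                for _, first in outgoing.get(entity_id, [])
                for _, second in outgoing.get(first, [])
            }
        )
        docs[entity_id] = doc
    return docs


docs = build_documents(entities, edges)
```

With this toy data, the document for `p1` carries both the movie the person acted in and, through the pre-computed two-hop field, the studio behind that movie. The trade-off in any such design is that materialized fields must be refreshed when the underlying edges change.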
Federated Query Routing
- Sub-queries routed to appropriate graph partition owners
- Results merged at the search orchestration layer
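A rough sketch of the route-then-merge pattern described above, under stated assumptions: the partition functions (`search_titles`, `search_talent`), the `PARTITIONS` registry, and the score-based merge are all hypothetical stand-ins, since the summary does not describe the actual routing or ranking logic. Sub-queries fan out to the partitions that own the requested entity types, and the orchestration layer merges the hits.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical partition owners: each searches only its own entity type.
def search_titles(query):
    data = {"Example Movie": 0.9, "Another Movie": 0.4}
    return [
        {"id": name, "type": "title", "score": score}
        for name, score in data.items()
        if query.lower() in name.lower()
    ]


def search_talent(query):
    data = {"Example Person": 0.7}
    return [
        {"id": name, "type": "talent", "score": score}
        for name, score in data.items()
        if query.lower() in name.lower()
    ]


# Hypothetical routing table: entity type -> owning partition's search function.
PARTITIONS = {"title": search_titles, "talent": search_talent}


def federated_search(query, entity_types, limit=10):
    """Fan the query out to the owning partitions, then merge by score."""
    targets = [PARTITIONS[t] for t in entity_types if t in PARTITIONS]
    if not targets:
        return []
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        result_lists = list(pool.map(lambda search: search(query), targets))
    merged = [hit for hits in result_lists for hit in hits]
    # Assumption: scores are comparable across partitions. In practice they
    # often are not, and a normalization or re-ranking step would go here.
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged[:limit]


results = federated_search("example", ["title", "talent"])
```

Unknown entity types are skipped rather than raising, which keeps the orchestration layer tolerant of partitions that are not registered; whether that is the right failure mode depends on the system.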
Vector Embeddings for Entities
- Entity embeddings capture semantic relationships from each entity's graph neighborhood
- Enable similarity search across entity types (“movies similar to X”)
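One simple way to picture a neighborhood-derived embedding is to average an entity's own vector with the vectors of its graph neighbors, then rank candidates by cosine similarity. This is an illustrative swap-in, not the method the series describes: the two-dimensional `base` vectors, the `neighbors` map, and the mean aggregation are all assumptions chosen to keep the example small.

```python
import math

# Hypothetical starting vectors for each entity (tiny, for illustration only).
base = {
    "movie_a": [1.0, 0.0],
    "movie_b": [0.9, 0.1],
    "movie_c": [0.0, 1.0],
    "actor_x": [1.0, 0.2],
    "actor_y": [0.1, 1.0],
}
# Hypothetical graph neighborhood: movie -> cast members.
neighbors = {
    "movie_a": ["actor_x"],
    "movie_b": ["actor_x"],
    "movie_c": ["actor_y"],
}


def neighborhood_embedding(entity_id):
    """Average the entity's own vector with its neighbors' vectors."""
    vectors = [base[entity_id]] + [base[n] for n in neighbors.get(entity_id, [])]
    dim = len(base[entity_id])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(entity_id, candidates):
    """Rank candidates by similarity to the query entity, best first."""
    query = neighborhood_embedding(entity_id)
    scored = [
        (c, cosine(query, neighborhood_embedding(c)))
        for c in candidates
        if c != entity_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("movie_a", ["movie_a", "movie_b", "movie_c"])
```

Here `movie_b` ranks above `movie_c` for the query `movie_a`, because the two share a cast member and have similar starting vectors. At catalog scale the exhaustive comparison would typically be replaced by an approximate nearest-neighbor index.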
Two-Part Series
- Part 1: Challenges and motivation — why traditional search fails for federated graphs
- Part 2: New architecture — embedding entities, federated query execution, search indexing
Related Concepts
- Knowledge Graph Search
- Dense Vector Retrieval
- Search Architecture
- Retrieval Pipeline
- Personalization