Query Understanding, Divided into Three Parts

Introduction

Borrowing from Caesar’s “Gallia est omnis divisa in partes tres” (“All Gaul is divided into three parts”), query understanding divides into three distinct components:

Holistic understanding — broad classification of topic and intent
Reductionist understanding — breaking the query into components and determining meanings
Resolution — transforming understood queries into executable search engine commands

Part One: Holistic Understanding

Examines the query as a whole, without breaking it into parts. Key tasks:

Language Identification — Recognizes query language for multilingual search.

Query Categorization — Maps queries into a predefined taxonomy (e.g., “History of the Roman Empire”).

High-Level Intent Recognition — Identifies searcher motivations beyond literal meaning (e.g., quotation lookup vs. biography search).

Implementation

Rule-based systems (regex, heuristics)
Machine learning classifiers trained on labeled data
Character n-gram embeddings (e.g., fastText) as feature vectors for those classifiers
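
As an illustration of the rule-based option, here is a minimal sketch of a holistic classifier built from regular expressions. The rule lists, label names, and the `classify` function are assumptions made for this example and do not come from the article; a production system would more likely learn these mappings from labeled data.

```python
import re
from dataclasses import dataclass

# Hypothetical rules: patterns and labels are illustrative assumptions.
INTENT_RULES = [
    ("quotation_lookup", re.compile(r'"[^"]+"|\bquotes?\b|\bwho said\b', re.IGNORECASE)),
    ("biography", re.compile(r"\b(biography|life of|born|died)\b", re.IGNORECASE)),
]

CATEGORY_RULES = [
    ("History of the Roman Empire", re.compile(r"\b(roman empire|caesar|gaul)\b", re.IGNORECASE)),
]


@dataclass
class HolisticResult:
    category: str
    intent: str


def first_match(rules, query, default):
    """Return the label of the first rule whose pattern matches the query."""
    for label, pattern in rules:
        if pattern.search(query):
            return label
    return default


def classify(query):
    """Assign one category and one high-level intent to the whole query."""
    return HolisticResult(
        category=first_match(CATEGORY_RULES, query, "unknown"),
        intent=first_match(INTENT_RULES, query, "general_search"),
    )
```

Rules are checked in order, so more specific patterns should come first. Queries that match nothing fall back to the default labels rather than raising an error.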

Part Two: Reductionist Understanding

Decomposes queries into meaningful segments and classifies them.

Query Segmentation — Divides queries into semantic units. Example: “roman empire poetry” → [“roman empire”, “poetry”].

Entity Recognition — Classifies each segment by type: “roman empire” = Subject, “poetry” = Genre.
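
The two steps above can be sketched with a simple dictionary lookup. This is not one of the sequence models discussed in the article; it is a greedy longest-match baseline, and the `ENTITY_TYPES` dictionary and `segment_and_tag` function are hypothetical names introduced for this example.

```python
# Hypothetical phrase dictionary: token tuple -> entity type.
ENTITY_TYPES = {
    ("roman", "empire"): "Subject",
    ("poetry",): "Genre",
}
MAX_PHRASE_LEN = max(len(phrase) for phrase in ENTITY_TYPES)


def segment_and_tag(query):
    """Split a query into segments and tag each one, longest match first."""
    tokens = query.lower().split()
    segments = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in ENTITY_TYPES:
                segments.append((" ".join(phrase), ENTITY_TYPES[phrase]))
                i += n
                break
        else:
            # No dictionary entry: keep the token as an untyped segment.
            segments.append((tokens[i], None))
            i += 1
    return segments


print(segment_and_tag("roman empire poetry"))
# [('roman empire', 'Subject'), ('poetry', 'Genre')]
```

A dictionary baseline like this cannot handle ambiguous or unseen phrases, which is the gap that the statistical approaches are meant to fill.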

Approaches

Historical: Hidden Markov Models (HMM), Conditional Random Fields (CRF)
Modern: Bidirectional LSTM-CRF models

The category assigned during holistic understanding can select which reductionist model to apply. Each model then only has to handle queries from one category, which keeps it simpler and improves accuracy.
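
One way to picture this routing is a lookup table from category to tagger. The sketch below uses trivial stand-in functions instead of trained models; the category name "books", the tag labels, and all function names are assumptions made for illustration.

```python
def tag_books(query):
    """Stand-in for a model trained only on book queries."""
    genres = {"poetry", "drama"}
    return [(tok, "Genre" if tok in genres else "Keyword")
            for tok in query.lower().split()]


def tag_default(query):
    """Fallback tagger for categories without a dedicated model."""
    return [(tok, "Keyword") for tok in query.lower().split()]


# Hypothetical registry: holistic category -> reductionist model.
MODELS_BY_CATEGORY = {"books": tag_books}


def reductionist_understanding(query, category):
    """Pick the per-category model, then let it tag the query."""
    model = MODELS_BY_CATEGORY.get(category, tag_default)
    return model(query)
```

Because each tagger only sees queries from its own category, its label set and training data can stay small.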

Part Three: Resolution

Translates understood queries into executable search backend queries.

Entity Mapping — Maps recognized entities to a knowledge base (taxonomies, ontologies, faceted classifications, knowledge graphs).

Query Assembly — Combines mapped entities and unmatched keywords into executable queries (AND operation as baseline).
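
A minimal sketch of entity mapping followed by the AND baseline is shown below. The knowledge base contents, field names, identifiers, and the output query syntax are all hypothetical; a real backend would have its own query language.

```python
# Hypothetical knowledge base: (entity type, surface form) -> (field, canonical ID).
KNOWLEDGE_BASE = {
    ("Subject", "roman empire"): ("subject_id", "S-ROMAN-EMPIRE"),
    ("Genre", "poetry"): ("genre_id", "G-POETRY"),
}


def assemble_query(segments):
    """Turn (text, entity_type) segments into one AND-ed query string."""
    clauses = []
    for text, entity_type in segments:
        mapped = KNOWLEDGE_BASE.get((entity_type, text))
        if mapped is not None:
            field, entity_id = mapped
            clauses.append(f"{field}:{entity_id}")
        else:
            # Unmatched segments fall back to a keyword match on the text.
            clauses.append(f'text:"{text}"')
    return " AND ".join(clauses)


print(assemble_query([("roman empire", "Subject"), ("poetry", "Genre"), ("latin", None)]))
# subject_id:S-ROMAN-EMPIRE AND genre_id:G-POETRY AND text:"latin"
```

Mapping to canonical identifiers before assembly means the backend filters on structured fields where possible and only uses free-text matching for what is left over.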

Advanced resolution features:

Query expansion (broaden for recall)
Query relaxation (soften constraints for low-result queries)
Intent-based selection of the ranking model or collection
Facet selection based on intent
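
Query relaxation, for example, can be sketched as a loop that drops constraints until enough results come back. The function below assumes the clauses are ordered from most to least important and that `search` is any callable returning a list of results; both are assumptions for this example, and the toy index exists only to make the sketch runnable.

```python
def search_with_relaxation(clauses, search, min_results=1):
    """Drop trailing clauses until the search returns enough results."""
    remaining = list(clauses)
    while remaining:
        results = search(remaining)
        if len(results) >= min_results:
            return remaining, results
        # Assumption: the last clause is the least important constraint.
        remaining.pop()
    return [], []


# Toy in-memory "index": each document is the set of clauses it satisfies.
TOY_INDEX = [
    {"subject:roman_empire", "genre:history"},
    {"subject:greek_myth", "genre:poetry"},
]


def toy_search(clauses):
    """Return documents that satisfy every clause (AND semantics)."""
    return [doc for doc in TOY_INDEX if set(clauses) <= doc]


used, results = search_with_relaxation(
    ["subject:roman_empire", "genre:poetry"], toy_search
)
print(used)  # ['subject:roman_empire']
```

Here the full query matches nothing, so the genre constraint is dropped and the subject-only query returns one document. Which constraint to drop first is a design decision that the ordering assumption hides.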

Conclusion

Implementing complete query understanding is substantial work — “query understanding can’t be built in one day.” The three-stage framework guides incremental prioritization and phased improvements.

People

Daniel Tunkelang
