Mapping Search Queries To Search Intents

Core Insight

“Search queries are not the same as search intents.” Multiple distinct queries can map to a single intent (e.g., “mens shoes” and “shoes for men”).

Recognizing Query Equivalence

Surface Query Similarity: Queries that differ only in stemming, lemmatization, word order, or stop words often express the same intent. This includes singular/plural variations and compound-word differences.
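As a minimal sketch of surface canonicalization, the following reduces a query to a sorted string of stemmed, non-stop-word tokens. The stop-word list and the `naive_stem` helper are illustrative assumptions, not part of the source; a real system would likely use a proper stemmer or lemmatizer.

```python
import re

# Hypothetical stop-word list; a production system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def naive_stem(token: str) -> str:
    """Crude stand-in for a real stemmer: strips possessive and plural endings."""
    if token.endswith("'s"):
        token = token[:-2]
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("es") and len(token) > 4 and token[-3] in "sxz":
        return token[:-2]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def canonicalize(query: str) -> str:
    """Lowercase, drop stop words, stem, and sort tokens alphabetically."""
    tokens = re.findall(r"[a-z0-9']+", query.lower())
    stems = [naive_stem(t) for t in tokens if t not in STOP_WORDS]
    return " ".join(sorted(stems))


# Queries that differ only in word order, stop words, or plurals share a key.
same_intent = canonicalize("mens shoes") == canonicalize("shoes for men")

# Word order is discarded, so these two also collide despite differing in intent.
collision = canonicalize("dress shirt") == canonicalize("shirt dress")
```

Sorting the tokens makes the key order-independent, which is what lets "mens shoes" and "shoes for men" collapse together, and also why surface similarity alone cannot tell "dress shirt" from "shirt dress".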

Similar Post-Search Behavior: Queries with equivalent intent generate matching user engagement patterns. Behavioral similarity can be represented using vector embeddings of result titles.
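One way to make this concrete, sketched under assumptions: represent each query by the mean (centroid) of the embeddings of its result titles, then compare queries by cosine similarity. The three-dimensional vectors below are made-up placeholders; in practice they would come from an embedding model, and averaging is only one possible way to combine them.

```python
import math


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up title embeddings for the results engaged with after each query.
mens_shoes = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]
shoes_for_men = [[0.85, 0.15, 0.0], [0.9, 0.05, 0.05]]
shirt_dress = [[0.0, 0.2, 0.9]]

similar = cosine(centroid(mens_shoes), centroid(shoes_for_men))
different = cosine(centroid(mens_shoes), centroid(shirt_dress))
```

With these placeholder vectors, `similar` comes out close to 1 and `different` close to 0, which is the pattern expected for equivalent versus distinct intents.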

Combined Approach

Group queries by surface similarity via canonicalization (stem tokens, remove stop words, sort the remaining tokens alphabetically)
Split each group into behavioral clusters using cosine similarity between the queries' vectors

This ensures that paired queries are both linguistically and behaviorally equivalent, which prevents false positives like “dress shirt” vs. “shirt dress”: the two share a canonical form but lead to different post-search behavior.
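The two steps can be sketched roughly as follows. The source does not specify a clustering algorithm, so the greedy split against each cluster's first member, the 0.9 threshold, the simplified `surface_key` (no stemming), and the two-dimensional behavior vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical, deliberately short stop-word list.
STOP_WORDS = {"a", "for", "of", "the"}


def surface_key(query):
    """Simplified canonical form: lowercase, drop stop words, sort tokens."""
    tokens = [t for t in query.lower().split() if t not in STOP_WORDS]
    return " ".join(sorted(tokens))


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def intent_clusters(behavior, threshold=0.9):
    """Group queries by surface key, then split each group by behavior.

    `behavior` maps each query to its behavioral vector.
    """
    groups = defaultdict(list)
    for query in behavior:
        groups[surface_key(query)].append(query)

    clusters = []
    for members in groups.values():
        split = []  # behavioral clusters within this surface group
        for query in members:
            for cluster in split:
                # Compare against the cluster's first member (greedy choice).
                if cosine(behavior[query], behavior[cluster[0]]) >= threshold:
                    cluster.append(query)
                    break
            else:
                split.append([query])
        clusters.extend(split)
    return clusters


# Made-up behavior vectors: all three queries share the same surface key.
behavior = {
    "dress shirt": [0.9, 0.1],
    "shirt dress": [0.1, 0.9],
    "the dress shirt": [0.88, 0.12],
}
clusters = intent_clusters(behavior)
```

Here all three queries land in one surface group, and the behavioral split separates "shirt dress" from the two "dress shirt" variants.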

Tradeoffs

Approach	Pros	Cons
Surface similarity only	Minimizes false positives	Misses synonyms and reformulations
Behavioral only	Captures semantic equivalence	Risks conflating different intents (e.g., “pants” vs. “dress pants”)
Combined	Best precision + recall	More complex to implement

Applications

Query Rewriting: Convert equivalent queries to canonical representations optimizing retrieval/ranking
Analytics: Aggregate fragmented behavioral signals across equivalent queries
Machine Learning: Use consolidated signals for more robust model training
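For the analytics case, a minimal sketch of signal aggregation might look like this. It assumes a query-to-intent mapping has already been produced by some equivalence step; the mapping, the intent labels, and the click counts are invented for illustration.

```python
from collections import Counter


def aggregate_by_intent(query_clicks, query_to_intent):
    """Sum per-query click counts under each query's canonical intent.

    Queries without a known intent are kept under their own text.
    """
    totals = Counter()
    for query, clicks in query_clicks.items():
        totals[query_to_intent.get(query, query)] += clicks
    return totals


# Hypothetical mapping and click counts.
query_to_intent = {"mens shoes": "men shoe", "shoes for men": "men shoe"}
query_clicks = {"mens shoes": 120, "shoes for men": 30, "shirt dress": 45}

totals = aggregate_by_intent(query_clicks, query_to_intent)
```

The same mapping could drive query rewriting, by replacing each incoming query with its canonical intent before retrieval.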

People

Daniel Tunkelang
