Autocomplete and Autosuggest

How to build search suggestions that help users express intent faster. Autocomplete (prefix completion) and autosuggest (semantic fallback) are two distinct problems with different architectures.

Autocomplete vs. Autosuggest

	Autocomplete	Autosuggest
Trigger	User typing a prefix	Query typed but intent unclear
Mechanism	Prefix index / FST lookup	Semantic similarity / embedding KNN
Goal	Complete the query the user is typing	Suggest alternative framings
Response time	<10ms	<50ms

Both are often displayed in the same suggestion dropdown, but use different backends.

Retrieval: How to Get Candidates Fast

Edge N-gram Index

Pre-index all suggestion strings with edge n-grams at index time. Fast prefix lookup at query time, but large index size.

Finite State Transducer (FST)

Compact automaton for prefix matching. Lucene uses FSTs internally for its suggest component. Orders of magnitude more memory-efficient than edge n-grams.

Trie

Simple prefix tree. Good for small vocabularies; doesn’t scale to millions of suggestions.

Practical Choice

Use your search engine’s native suggest feature (Elasticsearch’s search_as_you_type, completion suggester) for most cases. Only build custom FST if you need custom ranking or multi-field matching.

Ranking: Once You Have Candidates

Baseline: Frequency-Based

Sort by historical query frequency. Fast, interpretable, but ignores context.

Probabilistic Model

P(C | prefix) ∝ P(prefix | C) × P(C) — noisy channel framing. P(C) = frequency prior; P(prefix|C) = edit model. Used in spell-correction-aware autocomplete.

LTR Feature Categories

Temporal: recency of last usage, trending signal (24h spike), seasonality

Behavioral: CTR of the completion if clicked, conversion rate, session abandonment rate

Contextual: user location, language/locale, session history (what has the user already searched?)

Query: length of prefix typed, specificity of the candidate

Two-Phase for Latency

Fast retrieval (<10ms): FST/edge n-grams → top-100 candidates
Ranking (<30ms): LTR model to re-score and reorder

Personalization

User-specific ranking:

Boost completions the user has typed before (with recency decay)
Boost completions in categories the user frequently shops
Cold start: fall back to location-based defaults, demographic cohort, or pure frequency

Cold Start: No Query History

New search products have no query logs to build suggestions from. Options:

Seed from catalog: extract noun phrases from product descriptions as initial suggestions
LLM generation: prompt an LLM to generate realistic search queries for each product/category. David Albrecht’s approach: GPT-4.1-nano generated 10 queries per product for 43k items at <$5 cost. Produces novel phrases traditional NLP can’t extract.
Competitor/open data: use open query datasets as bootstrapping signal (MS MARCO, Amazon ESCI)

LLM-generated queries solve cold-start while producing user-like phrases. Validate over time with click/conversion data.

Scaling Autocomplete

The naive approach (significant_terms aggregation on shingles) collapses at millions of documents — CPU saturation and multi-second queries.

What works at scale:

Separate autocomplete index (different scaling profile from main search)
search_as_you_type field type in Elasticsearch + match_phrase_prefix
terms aggregation (frequency-ordered) instead of significant_phrases
Debounce 25-50ms before firing (saves 40-50% of requests)
Views/popularity as primary ranking signal via field_value_factor + log1p

At 6M+ docs: achieves 416 req/s at p99 ≤ 45ms.

Common Failure Modes

Showing irrelevant completions: frequency ranking without context → “apple” suggests “apple pie” to a developer

Stale suggestions: popular-last-year completions that nobody searches now — need recency decay and trending signal

Overly long completions: users want to type less, not see their full query echoed back

Zero-results for accepted suggestion: suggestion exists but accepted query returns no results — catastrophic UX failure

Query Understanding in Practice — suggestions feed into QU pipeline
Personalization in Search — user-specific ranking of completions
E-commerce Search — autocomplete is especially high-ROI in product search

Awesome Search KG

Explorer

Autocomplete and Autosuggest

Autocomplete and Autosuggest

Autocomplete vs. Autosuggest

Retrieval: How to Get Candidates Fast

Edge N-gram Index

Finite State Transducer (FST)

Trie

Practical Choice

Ranking: Once You Have Candidates

Baseline: Frequency-Based

Probabilistic Model

LTR Feature Categories

Two-Phase for Latency

Personalization

Cold Start: No Query History

Scaling Autocomplete

Common Failure Modes

Graph View

Table of Contents

Backlinks

Awesome Search KG

Explorer

Autocomplete and Autosuggest

Autocomplete and Autosuggest

Autocomplete vs. Autosuggest

Retrieval: How to Get Candidates Fast

Edge N-gram Index

Finite State Transducer (FST)

Trie

Practical Choice

Ranking: Once You Have Candidates

Baseline: Frequency-Based

Probabilistic Model

LTR Feature Categories

Two-Phase for Latency

Personalization

Cold Start: No Query History

Scaling Autocomplete

Common Failure Modes

Related

Graph View

Table of Contents

Backlinks