Implementing Typo-Tolerant Search with PostgreSQL
Back to Blog

Implementing Typo-Tolerant Search with PostgreSQL

After implementing a basic autocomplete system using trigram indexing in PostgreSQL, I wanted to add typo tolerance next.

I already covered the prefix-search side in Building Prefix Search Autocomplete Using Trigram Indexing, but prefix matching alone is not enough for real-world search.

For example, if a user types piza, the system should still be able to return:

pizza
pizza hut
pizza toronto

This is accomplished by implementing a typo-tolerant query.

Basic Typo-Tolerant Query

PostgreSQL provides the % operator for similarity matching.

Example:

SELECT
    term,
    similarity(term, 'piza') AS score
FROM search_terms
WHERE term % 'piza'
ORDER BY score DESC
LIMIT 10;

This searches for terms sufficiently similar to "piza" and ranks them by similarity score.

The similarity() function returns a value between:

  • 0 → completely different
  • 1 → identical

Similarity Threshold

The % operator uses a similarity threshold internally. The default threshold is 0.3, but you can adjust it:

SET pg_trgm.similarity_threshold = 0.2;

Lower thresholds:

  • allow fuzzier matches
  • return more results

Higher thresholds:

  • produce stricter matching
  • reduce false positives

Usually something around 0.2 - 0.4 works reasonably well for autocomplete.

Combining Prefix and Fuzzy Matching

In practice, it usually makes sense to combine:

  • prefix matching
  • typo tolerance
  • popularity ranking

Example:

SELECT
    term,
    search_count,
    similarity(term, 'piza') AS score
FROM search_terms
WHERE
    term ILIKE 'piza%'
    OR term % 'piza'
ORDER BY
    score DESC,
    search_count DESC
LIMIT 10;

This allows:

  • exact prefix matches first
  • typo-tolerant fallback matches
  • more popular searches ranked higher

Important Limitation

Similarity matching works much better once the input reaches at least 3 characters. Therefore, very short inputs like p or pi do not contain enough information for meaningful fuzzy matching.

A common approach is:

  • use normal prefix matching for short inputs
  • enable similarity matching after 3+ characters

That usually produces much better autocomplete behavior.