Building Prefix Search Autocomplete Using Trigram Indexing
Back to Blog

Building Prefix Search Autocomplete Using Trigram Indexing

I needed to implement an autocomplete feature for a search field, so I looked into how these systems are usually built. Turns out the basic implementation is much simpler than I expected, and PostgreSQL alone can already handle a lot of it.

The idea is straightforward:

  • store search terms
  • track how often they are searched
  • query terms matching the user's prefix
  • add proper indexing so the queries stay fast

Storing Search Terms

For a simple autocomplete system, a single table is enough:

CREATE TABLE search_terms (
    term TEXT PRIMARY KEY,
    search_count INTEGER NOT NULL DEFAULT 0,
    updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);

Whenever a user performs a search, increment the counter:

INSERT INTO search_terms(term, search_count)
VALUES ('pizza toronto', 1)
ON CONFLICT(term)
DO UPDATE SET
    search_count = search_terms.search_count + 1,
    updated_at = NOW();

Example data:

term search_count
pizza 1200
pizza hut 850
pizza toronto 430
pickup 300

Basic Autocomplete Query

When the user types: pi, the backend can query:

SELECT term, search_count
FROM search_terms
WHERE term ILIKE 'pi%'
ORDER BY search_count DESC
LIMIT 10;

This gives:

  • prefix matching
  • popularity ranking
  • deterministic results

Trigram Indexes

The important part is indexing. Without an index, PostgreSQL must scan the entire table for every autocomplete request, which becomes expensive once the dataset grows:

WHERE term ILIKE 'pi%'

PostgreSQL solves this using the pg_trgm extension, which implements trigram indexing.

A trigram is simply a sequence of 3 consecutive characters. For example:

pizza

is internally broken into:

" pi"
"piz"
"izz"
"zza"
"za "

PostgreSQL indexes these trigram fragments instead of entire strings. This allows it to quickly narrow down candidate matches for partial searches like:

  • LIKE
  • ILIKE
  • fuzzy matching
  • typo-tolerant matching

Enable trigram support:

CREATE EXTENSION pg_trgm;

Then create a GIN trigram index:

CREATE INDEX idx_search_terms_trgm
ON search_terms
USING gin (term gin_trgm_ops);

Now prefix searches become dramatically faster, even with large datasets.

Another nice benefit is typo tolerance. A query like piza can still match pizza using PostgreSQL similarity functions.