Moving from MySQL JSON to Postgres arrays

When migrating from MySQL to Postgres, one of the common surprises is how often JSON columns show up in places that are really just lists. Tags are a typical example:

["sushi", "japanese", "late-night"]

In MySQL this is usually fine: it's flexible, easy, and doesn't require much thought. But once you move to Postgres, you start to feel the mismatch between "this is JSON" and "this is actually just a list I want to query".

That's often where arrays start to look more natural:

tags text[]

It's a small shift, but it changes the way the data behaves. Instead of a serialized structure, it becomes something the database understands as a first-class list type.
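Getting there from a JSON column takes a small migration. A sketch, assuming the existing column is named tags and always holds a flat JSON array of strings (an ALTER COLUMN ... USING expression can't contain the subquery this needs, so the backfill goes through a temporary column):

-- add the array column alongside the JSON one
ALTER TABLE restaurants ADD COLUMN tags_arr text[];

-- backfill it from the JSON values
UPDATE restaurants
SET tags_arr = ARRAY(SELECT jsonb_array_elements_text(tags::jsonb));

-- swap the columns over
ALTER TABLE restaurants DROP COLUMN tags;
ALTER TABLE restaurants RENAME COLUMN tags_arr TO tags;

ARRAY(SELECT ...) preserves the order the elements appear in the JSON, so the tag order survives the conversion.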

Working with arrays and making them usable

Once tags are stored as arrays, querying becomes more direct.

To check if a tag exists:

SELECT *
FROM restaurants
WHERE 'sushi' = ANY(tags);

Or using the containment operator:

SELECT *
FROM restaurants
WHERE tags @> ARRAY['sushi'];

Modifying data is also straightforward:

UPDATE restaurants
SET tags = array_remove(tags, 'sushi');
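Adding a tag is the mirror image. A sketch that guards against duplicates, since arrays will happily hold repeated values:

UPDATE restaurants
SET tags = array_append(tags, 'sushi')
WHERE NOT tags @> ARRAY['sushi'];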

At this point, arrays feel simple and readable, especially compared to parsing JSON strings or maintaining join tables for very lightweight use cases.

The key factor that makes arrays actually viable at scale is indexing. Without it, Postgres still has to scan rows to evaluate array conditions, which quickly becomes expensive on larger datasets.

That's where GIN indexes come in:

CREATE INDEX idx_restaurants_tags
ON restaurants
USING GIN (tags);

With a GIN index in place, containment queries like:

WHERE tags @> ARRAY['sushi']

can be resolved without a full table scan. One caveat: the 'sushi' = ANY(tags) form is not indexable by the default array operator class, so it still scans the table. For indexed lookups, stick to the containment (@>) and overlap (&&) operators.
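The same GIN index also serves overlap queries, which match rows sharing at least one tag with a given list; a sketch using the table above:

SELECT *
FROM restaurants
WHERE tags && ARRAY['sushi', 'ramen'];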

Tradeoffs and where arrays fit

Arrays sit in a middle ground between JSON and full relational modeling. They work well when the data is genuinely list-like and doesn't need additional structure.

They are a good fit when:

  • the values are simple (tags, labels, flags)
  • the primary operation is filtering or membership checks
  • schema simplicity is preferred over normalization

They become less suitable when:

  • individual items need their own attributes
  • relationships between items matter
  • the "list" starts behaving like a first-class entity

In those cases, a join table tends to be a better long-term structure.
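For comparison, a minimal sketch of that join-table shape, assuming restaurants has an integer primary key named id:

CREATE TABLE tags (
    id   serial PRIMARY KEY,
    name text UNIQUE NOT NULL
);

CREATE TABLE restaurant_tags (
    restaurant_id int NOT NULL REFERENCES restaurants (id),
    tag_id        int NOT NULL REFERENCES tags (id),
    PRIMARY KEY (restaurant_id, tag_id)
);

Each tag now has its own row, so it can carry attributes of its own and be queried as a first-class entity.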