Question:
Say I have a table with a jsonb column that contains data structured like this:
{
  "records": [
    {
      "id": 1,
      "tag": "address"
    },
    {
      "id": 1,
      "tag": "receipt"
    }
  ]
}
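For reference, here's a minimal setup that matches this layout (the table name t1 comes from the index statement below; the column name jsonb_colname is just a placeholder):

-- hypothetical table and sample row matching the document above
CREATE TABLE t1 (
    id            bigserial PRIMARY KEY,
    jsonb_colname jsonb
);
INSERT INTO t1 (jsonb_colname) VALUES
  ('{"records": [{"id": 1, "tag": "address"}, {"id": 1, "tag": "receipt"}]}');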
I’d like to be able to index this column such that I can request “give me all rows where there is a record of type ‘receipt’”.
I’m trying to find an expression that would output a jsonb array that looks like
["address","receipt"]
so that I can plug that expression into
CREATE INDEX ON t1 USING GIN (expr)
I’ve been playing around with all different combinations of the JSONB functions, but I think I’m in over my head.
Answer:
I wouldn’t store the data like that in the first place, if I could help it. Assuming that ship has sailed, I would at least transform it into a native PostgreSQL array rather than a JSON(B) array, as that would give you better statistics estimates, and therefore probably better execution plans, as well as just plain faster execution.
Due to some restrictions on combining set-returning functions with aggregates inside expression indexes (index expressions can’t contain subqueries or aggregates), I couldn’t find a way to do this without a helper function (which is probably a good idea anyway).
CREATE OR REPLACE FUNCTION public.helper(jsonb)
RETURNS text[]
LANGUAGE sql
IMMUTABLE
AS $function$
  -- collect the "tag" values of the "records" array into a native text[]
  SELECT array_agg(x)
  FROM (SELECT jsonb_array_elements($1->'records')->>'tag') AS f(x);
$function$;
Followed by:
CREATE INDEX ON t1 USING GIN (public.helper(jsonb_colname));
And then:
select * from t1 where public.helper(jsonb_colname) && ARRAY['receipt'];
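As a quick sanity check, calling the helper directly on the sample document from the question should produce a native text array:

select public.helper('{"records": [{"id": 1, "tag": "address"}, {"id": 1, "tag": "receipt"}]}'::jsonb);
-- {address,receipt}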
But taking a step back, you could do this directly using the built-in jsonb containment operators, with no transformations or expressions at all (although the index will probably be larger and slower than a tailored one):
create index on t1 using gin (jsonb_colname);
select * from t1 where jsonb_colname @> '{"records":[{"tag":"receipt"}]}';
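If you go that route, the non-default jsonb_path_ops operator class is worth considering: it only supports the containment operator @>, but for queries like the one above its index is typically smaller and faster than the default:

-- same containment query works, only the operator class changes
create index on t1 using gin (jsonb_colname jsonb_path_ops);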