How to obtain the path to the match of a JSONPath query in PostgreSQL 14?

Posted on

Question :

Here is a minimal working example that illustrates my point, the query finds all the values of keys named "$" at any level in the given JSON:

select jsonb_path_query(
$$
{
  "id": "test",
  "data": [
    {
      "$": "ref1"
    },
    {
      "$": "ref2"
    },
    {
      "engine": "dual",
      "fuel": "nuclear"
    }
  ],
  "owner": {
    "$": "ref3"
  },
  "bazaar": {
        "manager": {
        "$": "ref4"
      }
  }
}$$,
    'strict $.**."$"'
);

The returned results are:

"ref1"
"ref2"
"ref3"
"ref4"

However, what I need is a list of paths to these matches, i.e.,

"$['data'][0]['$']",
"$['data'][1]['$']",
"$['owner']['$']",
"$['bazaar']['manager']['$']"

How can this be accomplished?

Context: a database contains jsonb values, where the $ sign is a placeholder for a value stored in a different table. The choice of $ is unfortunate, because it is also a part of JSONPath syntax, but this is a legacy system, so things have to stay this way.

Let’s focus on "ref4" from my sample query. In another table, ref4 points to {"name": "Balthazar", "occupation": "engineer"}. My goal is to substitute the references in the original structure with the objects they point to. In other words, the JSON will change from ... "bazaar": {"owner": {"$": "ref4"}} to "bazaar": {"owner": {"name": "Balthazar", "occupation": "engineer"}}.

My intention is to write a function that will be called recursively, which looks for the $ entries and replaces them with the values from the other table, then returns the final JSON.

In the same context, I’d like to find out if you can suggest better ways to accomplish the objective.

Answer :

CREATE OR REPLACE FUNCTION jsonb_paths (data jsonb, prefix text[]) RETURNS SETOF text[] LANGUAGE plpgsql AS $$
DECLARE
    key text;
    value jsonb;
    path text[];
    counter integer := 0;
BEGIN
    IF jsonb_typeof(data) = 'object' THEN
        FOR key, value IN
            SELECT * FROM jsonb_each(data)
        LOOP
            IF jsonb_typeof(value) IN ('array', 'object') THEN
                FOR path IN
                    SELECT * FROM jsonb_paths (value, array_append(prefix, key))
                LOOP
                    RETURN NEXT path;
                END LOOP;
            ELSE
                RETURN NEXT array_append(prefix, key);
            END IF;
        END LOOP;
    ELSIF jsonb_typeof(data) = 'array' THEN
        FOR value IN
            SELECT * FROM jsonb_array_elements(data)
        LOOP
            IF jsonb_typeof(value) IN ('array', 'object') THEN
                FOR path IN
                    SELECT * FROM jsonb_paths (value, array_append(prefix, counter::text))
                LOOP
                    RETURN NEXT path;
                END LOOP;
            ELSE
                RETURN NEXT array_append(prefix, counter::text);
            END IF;
            counter := counter + 1;
        END LOOP;
    END IF;
END
$$;

Usage:

select * from jsonb_paths($$
{
  "id": "test",
  "data": [
    {
      "$": "ref1"
    },
    {
      "$": "ref2"
    },
    {
      "engine": "dual",
      "fuel": "nuclear"
    }
  ],
  "owner": {
    "$": "ref3"
  },
  "bazaar": {
        "manager": {
        "$": "ref4"
      }
  }
}$$, '{}') path where path @> '{$}';
┌────────────────────┐
│        path        │
├────────────────────┤
│ {data,0,$}         │
│ {data,1,$}         │
│ {owner,$}          │
│ {bazaar,manager,$} │
└────────────────────┘
(4 rows)

The paths returned are in #> operator or jsonb_set function format, but that’s easy to change.

Leave a Reply

Your email address will not be published. Required fields are marked *