Get the max value from a slice of a JSON Array?

Posted on

Question :

I would like to get the max value within a slice of a Json object (typicaly [1,2,3,5,6,7,9,10]) which is contained in a field named Data of the table raw.

The limits Start & End of the slice are contained in an other Json object named Features contained in a table named features

Here is the input:

CREATE TABLE raw (
    id     int PRIMARY KEY
      GENERATED BY DEFAULT AS IDENTITY

    data   json
);

INSERT INTO raw (data) VALUES
  ('[1,2,3,5,6,7,9,10]');

CREATE TABLE features (
    id         int,
    features   json
);

INSERT INTO features (id, features) VALUES
  (1, '{"Start" : 1, "End": 5}');

The output I would like is 7, i.e. the max value of the slice [2,3,5,6,7]

Here is what I came up with looking at other posts, but it does not work…

SELECT
    R."ID",
    F."Features"->>'Start' AS Start, 
    F."Features"->>'End' AS End,
    sort_desc((array(select json_array_elements(R."Data")))[F."Features"->>'Start':F."Features"->>'End'])[1] as maxData
FROM 
    raw AS R
INNER JOIN 
    features AS F ON R."ID" = F."ID"

The approximate error message I get is concerning sort_desc :

No function corresponding to this name or this type of arguments. You
should convert the type of data

Answer :

You can unnest json array:

Postgres WITH ORDINALITY:

When a function in the FROM clause is suffixed by WITH ORDINALITY, a bigint column is appended to the output which starts from 1 and increments by 1 for each row of the function’s output. This is most useful in the case of set returning functions such as unnest().

Have a look at this answer of Erwin Brandstetter:

  SELECT
     r."ID",
     MAX(t.elem::int) MaxElem
  FROM   
      raw r
  JOIN
      features f
      ON f."ID" = r."ID"
  JOIN LATERAL
      json_array_elements_text(r."Data")
      WITH ORDINALITY AS t(elem, n) ON TRUE
  WHERE
      n >= (f."Features"->>'Start')::int + 1 
      AND 
      n <= (f."Features"->>'End')::int + 1
  GROUP BY
      r."ID";
ID | maxelem
-: | ------:
 1 |       7

db<>fiddle here

Or if you prefer to use intarray module:

SELECT
    r."ID",
    (sort_desc(((ARRAY(SELECT json_array_elements_text(r."Data")))::int[])[(f."Features"->>'Start')::int + 1:(f."Features"->>'End')::int + 1]))[1]
FROM
    raw r
JOIN
    features f
    ON f."ID" = r."ID";

rextester here

This is all around a horrible schema. You shouldn’t be using json (as compared with jsonb) at all, ever (practically). If you’re querying on the field, it should be jsonb. In your case, that’s still a bad idea though, you likely want an sql array..

CREATE TABLE raw (
    raw_id   int  PRIMARY KEY
      GENERATED BY DEFAULT AS IDENTITY,
    data     int[]
);

INSERT INTO raw (data) VALUES ('{1,2,3,5,6,7,9,10}');

CREATE TABLE features (
    feature_id   int REFERENCES raw,
    low          smallint,
    high         smallint
);

INSERT INTO features ( feature_id, low, high ) VALUES ( 1, 1, 5 );

Now you can query it like this, note remember sql is 1-based,

SELECT max(unnest)
FROM raw
CROSS JOIN features AS f
CROSS JOIN LATERAL unnest(data[f.low:f.high]);

Also check out the intarray module, because it’ll optimize the above,

CREATE EXTENSION intarray;

SELECT max(unnest)
FROM raw
CROSS JOIN features AS f
CROSS JOIN LATERAL unnest(subarray(data,f.low,f.high-f.low+1))

You can further optimize this if you know you just need the last element of the array.


Note if this is a GIS problem, you’re still probably doing it wrong, but at least this method is sane.

Leave a Reply

Your email address will not be published. Required fields are marked *