Question :
I would like to get the max value within a slice of a Json object (typicaly [1,2,3,5,6,7,9,10]
) which is contained in a field named Data
of the table raw
.
The limits Start
& End
of the slice are contained in an other Json object named Features
contained in a table named features
Here is the input:
CREATE TABLE raw (
id int PRIMARY KEY
GENERATED BY DEFAULT AS IDENTITY
data json
);
INSERT INTO raw (data) VALUES
('[1,2,3,5,6,7,9,10]');
CREATE TABLE features (
id int,
features json
);
INSERT INTO features (id, features) VALUES
(1, '{"Start" : 1, "End": 5}');
The output I would like is 7
, i.e. the max value of the slice [2,3,5,6,7]
Here is what I came up with looking at other posts, but it does not work…
SELECT
R."ID",
F."Features"->>'Start' AS Start,
F."Features"->>'End' AS End,
sort_desc((array(select json_array_elements(R."Data")))[F."Features"->>'Start':F."Features"->>'End'])[1] as maxData
FROM
raw AS R
INNER JOIN
features AS F ON R."ID" = F."ID"
The approximate error message I get is concerning sort_desc
:
No function corresponding to this name or this type of arguments. You
should convert the type of data
Answer :
You can unnest json array:
Postgres WITH ORDINALITY:
When a function in the FROM clause is suffixed by WITH ORDINALITY, a bigint column is appended to the output which starts from 1 and increments by 1 for each row of the function’s output. This is most useful in the case of set returning functions such as unnest().
Have a look at this answer of Erwin Brandstetter:
SELECT
r."ID",
MAX(t.elem::int) MaxElem
FROM
raw r
JOIN
features f
ON f."ID" = r."ID"
JOIN LATERAL
json_array_elements_text(r."Data")
WITH ORDINALITY AS t(elem, n) ON TRUE
WHERE
n >= (f."Features"->>'Start')::int + 1
AND
n <= (f."Features"->>'End')::int + 1
GROUP BY
r."ID";
ID | maxelem -: | ------: 1 | 7
db<>fiddle here
Or if you prefer to use intarray module:
SELECT
r."ID",
(sort_desc(((ARRAY(SELECT json_array_elements_text(r."Data")))::int[])[(f."Features"->>'Start')::int + 1:(f."Features"->>'End')::int + 1]))[1]
FROM
raw r
JOIN
features f
ON f."ID" = r."ID";
rextester here
This is all around a horrible schema. You shouldn’t be using json
(as compared with jsonb
) at all, ever (practically). If you’re querying on the field, it should be jsonb
. In your case, that’s still a bad idea though, you likely want an sql array..
CREATE TABLE raw (
raw_id int PRIMARY KEY
GENERATED BY DEFAULT AS IDENTITY,
data int[]
);
INSERT INTO raw (data) VALUES ('{1,2,3,5,6,7,9,10}');
CREATE TABLE features (
feature_id int REFERENCES raw,
low smallint,
high smallint
);
INSERT INTO features ( feature_id, low, high ) VALUES ( 1, 1, 5 );
Now you can query it like this, note remember sql is 1-based,
SELECT max(unnest)
FROM raw
CROSS JOIN features AS f
CROSS JOIN LATERAL unnest(data[f.low:f.high]);
Also check out the intarray
module, because it’ll optimize the above,
CREATE EXTENSION intarray;
SELECT max(unnest)
FROM raw
CROSS JOIN features AS f
CROSS JOIN LATERAL unnest(subarray(data,f.low,f.high-f.low+1))
You can further optimize this if you know you just need the last element of the array.
Note if this is a GIS problem, you’re still probably doing it wrong, but at least this method is sane.