UPDATE and CASE problem [closed]

Posted on

Question :

It is my first question in this forum.
I wrote this code, but it is incorrect and I can’t find a problem.
I have 2 tables: sciana (gid,id,geom) and sciana1(gid,id,geom).
If table sciana is empty I want to sciana.id = sciana.gid, but if sciana is not null I want to find max(sciana.id) + sciana1.gid

UPDATE sciana1
SET id =
CASE
WHEN sciana.gid IS NULL THEN
    id = sciana1.gid
WHEN sciana.gid IS NOT NULL THEN
    (SELECT max(sciana.id)+sciana1.gid 
    FROM sciana )       
END;

First example:

`sciana` IS EMPTY, so after the update `sciana1` looks like

         1,1,geom1
         2,2,geom2
         ...

Second example:

sciana IS NOT EMPTY

         1,1,geom1,
         ...
         9,9,geom9

so after the update, sciana1 contains

         1,10,geom10
         2,11,geom11
         ...

Answer :

I think you want to do something like

UPDATE sciana1
   SET id = gid + COALESCE((SELECT max(id) FROM sciana), 0);

COALESCE() will return you the first non-NULL parameter. Then max(id) is NULL for an empty table, which means the COALESCE expression returns 0 when the table is empty, and the actual maximum when it is not.

You can further improve this by checking if the value will be different after the update, thus sparing unnecessarily bloating the table:

UPDATE sciana1
   SET id = gid + COALESCE((SELECT max(id) FROM sciana), 0)
 WHERE id IS DISTINCT FROM gid + COALESCE((SELECT max(id) FROM sciana), 0);

IS DISTINCT FROM is an operator that behaves similarly to <>, but also considers NULLs. With a plain <> you might not get anything updated, as the comparison itself results in NULL (thus not matching any rows) if sciana1.id is empty.

One more important thing is that if sciana is big, an index on id can help a lot in finding its maximum.

And a last note: in the second UPDATE query, the SELECT max()... part is executed twice. This can be prevented by moving it out to a CTE or a subquery like

UPDATE sciana1
   SET id = gid + COALESCE(m, 0)
  FROM (SELECT max(id) FROM sciana) AS t(m)
 WHERE id IS DISTINCT FROM gid + COALESCE(m, 0);

One should check which one results in a more efficient execution plan. In my experiments, the latter produces a nested join, which with a higher number of rows in both tables gets slower than the first version, while being faster with low row numbers.

Leave a Reply

Your email address will not be published. Required fields are marked *