Postgres: strange long request: A is fast, B is fast, A WHERE IN B is slow

Posted on

Question :

There is a lot of rows in table “logs” (the Request A FROM).

Request A is quick and Request B is quick.

Put together, Request A WHERE IN Request B is very long. Is that normal?

Request A is very quick:

--request A
SELECT "rid", max("createdAt") as "createdAt"
FROM "logs"
WHERE "rid" IN (17,71,196,187,111,86,108,81,54,184,245,27,118,100,175,136,130,67,45)
GROUP BY "rid";

Request B is very quick:

--request B
SELECT "dr"."rid"
FROM (
  SELECT *
  FROM (
    SELECT "eid", "tid", count(*) over (partition by "eid", "tid") count, id as "rid"
    FROM rs
    WHERE "deletedAt" is NULL
  ) "nr"
WHERE count > 1
) "dr"
INNER JOIN teams ON teams.id = "dr"."tid"
INNER JOIN projects ON projects.id = teams."pid"
ORDER BY "eid", "tid"

The result of B is the same list of numbers than in request A WHERE IN.

Replacing request A WHERE IN by request B, it becomes very slow.

Postgres: 9.5.6

Query plan for Request A:
enter image description here

Query plan or Request B:
enter image description here

Query plan for Requast A WHEREIN B:
enter image description here

Answer :

Put together, Request A WHERE IN Request B is very long. Is that
normal?

Yes, that’s fairly common when elaborating queries. In the first case, the IN(...) clause has only a handful of constants. The planner knows exactly how many values and how frequent they are in the table, so it has a good chance to find an optimal execution plan (in this case, an index scan).

When the inside of this IN (...) clause is replaced by a subquery, the planner may mis-estimate the number of values that this subquery will produce, or their frequency, or both, especially if the subquery is complex, and so it may choose a non-optimal plan compared to when there is a list of constants inside IN (...).

Leave a Reply

Your email address will not be published. Required fields are marked *