PostgreSQL uses slower constraint over faster index

Question :

I have a few tables (25m+ records) with a heavily INSERT-, UPDATE- and DELETE-biased workload (typically 50,000 to 100,000 operations a day) where the query planner seems to make an odd choice of index, favouring a constraint over other indexes. A bit of trial and error shows that using the constraint is typically somewhere between 80x and 300x slower than using one of our indexes.

As an example, let's say our table looks something like this:

Column                | Type                   | Nullable

id                    | uuid                   | not null
device_id             | character varying(255) | not null
device_child_id       | character varying(255) | not null
device_grandchild_id  | smallint               | not null
device_data_type      | character varying(255) | not null
device_data_unit      | character varying(255) | not null
data_date             | date                   | not null
<data>

And indexes:

"devices_pkey" PRIMARY KEY, btree (id)
...
"device_date" btree (device_id, data_date)
"device_child_grandchild" btree (device_id, device_child_id, device_grandchild_id)
"device_child_grandchild_date_constraint" EXCLUDE USING gist (data_date WITH =, device_id WITH =, device_child_id WITH =, device_grandchild_id WITH =, device_data_type WITH =, device_data_unit WITH =)

Some notes about the data:

  • There are around 50,000 unique device_id.
  • Each device_id may have up to 4 device_child_ids (but most have 1)
  • Each device_child_id may have up to 6 device_grandchild_ids (but most have 1)
  • Each of the combinations of the 3 ids above has a data_date and a bunch of data in other columns. The dates are not necessarily contiguous (though most are) and range from a few days to a few years – the largest set being around 4,500 records.

We use the constraint to ensure that we don’t have more than one row of data for the combination of the 4 fields above (and 2 others that don’t change currently).

Here’s the output from a little bit of EXPLAIN, first using the constraint:

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT
  id
FROM
  devices
WHERE 
device_id = '<device_id>'
  AND device_child_id = 'ABC123456789'
  AND device_grandchild_id = 1
  AND NOT (local_date <@ DATERANGE('2018-01-01', '2019-01-01', '[]'));
 Index Scan using device_child_grandchild_date_constraint on public.devices  (cost=0.42..2.64 rows=1 width=1780) (actual time=2.346..1396.550 rows=760 loops=1)
   Output: id
   Index Cond: (((devices.device_id)::text = '<device_id>'::text) AND ((devices.device_child_id)::text = 'ABC123456789'::text))
   Filter: ((NOT (devices.local_date <@ '[2018-01-01,2019-01-02)'::daterange)) AND (devices.device_grandchild_id = 1))
   Rows Removed by Filter: 1
   Buffers: shared hit=17315 read=2932 dirtied=2
   I/O Timings: read=1147.917
 Planning Time: 1.007 ms
 Execution Time: 1396.691 ms
(9 rows)

And again, but with the device_child_id condition removed to trick the planner into using an index:

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT
  id
FROM
  devices
WHERE 
device_id = '<device_id>'
  -- AND device_child_id = 'ABC123456789'
  AND device_grandchild_id = 1
  AND NOT (local_date <@ DATERANGE('2018-01-01', '2019-01-01', '[]'));
 Index Scan using device_child_grandchild on public.devices  (cost=0.69..844.22 rows=722 width=1780) (actual time=3.192..8.517 rows=760 loops=1)
   Output: id
   Index Cond: (((devices.device_id)::text = '<device_id>'::text) AND (devices.device_grandchild_id = 1))
   Filter: (NOT (devices.local_date <@ '[2018-01-01,2019-01-02)'::daterange))
   Rows Removed by Filter: 2
   Buffers: shared hit=753 read=20
   I/O Timings: read=7.103
 Planning Time: 0.074 ms
 Execution Time: 8.590 ms

The biggest discrepancy I can see is between the estimated and actual row counts in the index scan portion of the two plans, i.e.
(cost=0.42..2.64 rows=1 width=1780) (actual time=2.346..1396.550 rows=760 loops=1) vs (cost=0.69..844.22 rows=722 width=1780) (actual time=3.192..8.517 rows=760 loops=1)

From reading a stack of other posts, it seems there is no way to tell the planner not to use the constraint, and that the right way to address this is to improve the statistics the planner uses. But I'm entirely unsure which columns to alter, or whether it's the index itself that needs changing.
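For reference, if tweaking statistics were the route to take, it might look something like the sketch below. The column choices and targets here are purely illustrative assumptions, not a known fix:

```sql
-- Illustrative only: raise the per-column statistics target for the
-- columns used in the problem query, then re-analyze.
ALTER TABLE devices ALTER COLUMN device_id SET STATISTICS 1000;
ALTER TABLE devices ALTER COLUMN device_child_id SET STATISTICS 1000;

-- Extended statistics (PostgreSQL 10+) can capture the functional
-- dependency between the three id columns; whether the planner can
-- apply this to costing a GiST scan is a separate question.
CREATE STATISTICS devices_id_deps (dependencies)
  ON device_id, device_child_id, device_grandchild_id
  FROM devices;

ANALYZE devices;
```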

Any clues greatly appreciated!

Answer :

The planner lacks much insight into the inner workings of GiST indexes, which often leads to poor cost estimates for them.

However, I don’t see the point of the GiST index in the first place. Every column in it is using WITH =, so why not just use a unique constraint?
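As a sketch (assuming the two extra columns are device_data_type and device_data_unit, as in the table definition above, and using the constraint name from the question), the exclusion constraint could be swapped for a plain unique constraint, which is backed by a much cheaper B-tree index:

```sql
-- Drop the GiST-backed exclusion constraint
ALTER TABLE devices
  DROP CONSTRAINT device_child_grandchild_date_constraint;

-- A unique constraint enforces the same "at most one row per
-- combination" rule, since every operator in the exclusion
-- constraint was WITH =
ALTER TABLE devices
  ADD CONSTRAINT device_child_grandchild_date_key
  UNIQUE (device_id, device_child_id, device_grandchild_id,
          device_data_type, device_data_unit, data_date);
```

The B-tree index backing the unique constraint can then also serve equality lookups on its leading columns, much like the existing device_child_grandchild index.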
