PostgreSQL 9.6 optimizer uses index even though it’s slower

Posted on

Question :

I’m new to indexes and plans, so I had to ask for some help.

I have a table where I keep the following attributes:

id text NOT NULL 
t timestamp NOT NULL 
cost int NOT NULL

My query on the db is :

SELECT AVG(cost)
FROM my_table
WHERE my_to_char(t) = 'Sunday   '
;

The return time of the above query is about 4secs.
So, in order to increase query’s performance I created an index on my_to_char(t).

But this proved to have the opposite result. The query now returns within 30 secs.

However, the query planner chooses to use the index rather than go for a sequential scan.

How can the index be slower than seq scan on the first place. Any explanation?

Maybe because there are too many rows with my_to_char(t) = 'Sunday '?

EDIT – Query Plan

Aggregate  (cost=72841.43..72841.44 rows=1 width=32) (actual time=28383.473..28383.473 rows=1 loops=1)
Output: avg(cost)
->  Bitmap Heap Scan on my_table  (cost=900.60..72732.77 rows=43462 width=4) (actual time=120.778..28091.814 rows=1237954 loops=1)
Output: id, t, cost
Recheck Cond: (my_to_char(my_table.t) = 'Sunday   '::text)
Rows Removed by Index Recheck: 3053757
Heap Blocks: exact=33988 lossy=26432
->  Bitmap Index Scan on btree_date  (cost=0.00..889.74 rows=43462 width=0) (actual time=111.785..111.785 rows=1237954 loops=1)
Planning time: 0.270 ms
Execution time: 28384.284 ms

Answer :

After adding an expression index to a table, you need to manually ANALYZE the table. Otherwise PostgreSQL does not have access to the data it needs to make good decisions about the proper usage of that index. It has to make wild guesses, and obviously those guesses are poor in this case (it thinks it will find 43462 rows, but actually finds 1237954)

Perhaps adding an expression index should automatically trigger an ANALYZE to take place–but currently it does not work that way.

Leave a Reply

Your email address will not be published. Required fields are marked *