Very slow query in PostgreSQL on a 10M-row table

Question:

My queries are very slow. I created an index on the filter columns and set up partitioning, but the queries are still too slow. How can I make them faster?
Here’s the query and its EXPLAIN output:

SELECT
    (
        date_trunc('day', install_time)
    ) AS "d",
    COUNT (
        "application_events"."id"
    ) AS "agg"
FROM
    "application_events"
WHERE
    (
        "application_events"."application_id" = 9
        AND "application_events"."event_type" = 'install'
        AND "application_events"."install_time" BETWEEN '2015-10-01 00:00:00+00:00'
        AND '2015-10-14 23:59:59.999999+00:00'
    )
GROUP BY
    (
        date_trunc('day', install_time)
    )

EXPLAIN ANALYZE output:

HashAggregate  (cost=349746.20..349755.53 rows=746 width=12) (actual time=91074.961..91074.979 rows=14 loops=1)
  ->  Result  (cost=3364.54..349742.47 rows=746 width=12) (actual time=5443.146..91065.907 rows=2232 loops=1)
        ->  Append  (cost=3364.54..349740.61 rows=746 width=12) (actual time=5443.125..91047.422 rows=2232 loops=1)
              ->  Bitmap Heap Scan on application_events  (cost=3364.54..222625.69 rows=737 width=12) (actual time=5443.124..64304.831 rows=2232 loops=1)
                    Recheck Cond: (application_id = 9)
                    Rows Removed by Index Recheck: 1325863
                    Filter: ((install_time >= '2015-10-01 00:00:00+00'::timestamp with time zone) AND (install_time <= '2015-10-14 23:59:59.999999+00'::timestamp with time zone) AND (event_type = 'install'::text))
                    Rows Removed by Filter: 170837
                    ->  Bitmap Index Scan on application_events_application_id  (cost=0.00..3364.36 rows=177590 width=0) (actual time=2889.887..2889.887 rows=173069 loops=1)
                          Index Cond: (application_id = 9)
              ->  Bitmap Heap Scan on application_events_y2015m09  (cost=1702.64..127114.92 rows=9 width=12) (actual time=26738.094..26738.094 rows=0 loops=1)
                    Recheck Cond: (application_id = 9)
                    Filter: ((install_time >= '2015-10-01 00:00:00+00'::timestamp with time zone) AND (install_time <= '2015-10-14 23:59:59.999999+00'::timestamp with time zone) AND (event_type = 'install'::text))
                    Rows Removed by Filter: 99474
                    ->  Bitmap Index Scan on application_events_y2015m09_application_id_idx  (cost=0.00..1702.64 rows=101095 width=0) (actual time=480.529..480.529 rows=99474 loops=1)
                          Index Cond: (application_id = 9)
Total runtime: 91076.032 ms

Full structure

/*
Target Server Type    : PGSQL
Target Server Version : 90309
File Encoding         : 65001
Date: 2015-10-14 19:09:50
*/


CREATE TABLE "public"."application_events" (
"id" int4 DEFAULT nextval('application_events_id_seq'::regclass) NOT NULL,
"application_id" int4 NOT NULL,
"device_id" int4 NOT NULL,
"install_time" timestamptz(6) NOT NULL,
"event_date" date NOT NULL,
"event_time" timestamptz(6) NOT NULL,
"event_name" text COLLATE "default" NOT NULL,
"event_value" text COLLATE "default" NOT NULL,
"event_type" text COLLATE "default" NOT NULL,
)    ;
-- ----------------------------
-- Indexes structure for table application_events
-- ----------------------------
CREATE INDEX "application_events_application_id" ON "public"."application_events" USING btree (application_id);
CREATE INDEX "application_events_device_id" ON "public"."application_events" USING btree (device_id);
CREATE INDEX "application_events_event_date" ON "public"."application_events" USING btree (event_date);
CREATE INDEX "application_events_event_name" ON "public"."application_events" USING btree (event_name);
CREATE INDEX "application_events_event_name_like" ON "public"."application_events" USING btree (event_name "pg_catalog"."text_pattern_ops");

-- ----------------------------
-- Triggers structure for table application_events
-- ----------------------------
CREATE TRIGGER "after_insert_application_events_trigger" AFTER INSERT ON "public"."application_events"
FOR EACH ROW
EXECUTE PROCEDURE "application_events_delete_master"();
CREATE TRIGGER "before_insert_application_events_trigger" BEFORE INSERT ON "public"."application_events"
FOR EACH ROW
EXECUTE PROCEDURE "application_events_insert_child"();
CREATE TRIGGER "bucardo_delta" AFTER INSERT OR UPDATE OR DELETE ON "public"."application_events"
FOR EACH ROW
EXECUTE PROCEDURE "bucardo"."delta_public_application_events"();
CREATE TRIGGER "bucardo_kick_sync" AFTER TRUNCATE ON "public"."application_events"
FOR EACH STATEMENT
EXECUTE PROCEDURE "bucardo"."bucardo_kick_sync"();
CREATE TRIGGER "bucardo_note_trunc_sync" AFTER TRUNCATE ON "public"."application_events"
FOR EACH STATEMENT
EXECUTE PROCEDURE "bucardo"."bucardo_note_truncation"('sync');

-- ----------------------------
-- Primary Key structure for table application_events
-- ----------------------------
ALTER TABLE "public"."application_events" ADD PRIMARY KEY ("id");

-- ----------------------------
-- Foreign Key structure for table "public"."application_events"
-- ----------------------------
ALTER TABLE "public"."application_events" ADD FOREIGN KEY ("device_id") REFERENCES "public"."application_devices" ("id") ON DELETE NO ACTION ON UPDATE NO ACTION DEFERRABLE INITIALLY DEFERRED;
ALTER TABLE "public"."application_events" ADD FOREIGN KEY ("application_id") REFERENCES "public"."applications" ("id") ON DELETE NO ACTION ON UPDATE NO ACTION DEFERRABLE INITIALLY DEFERRED;

Table ANALYZE output:

INFO:  analyzing "public.application_events"

INFO:  "application_events": scanned 4 of 4 pages, containing 0 live rows and 48 dead rows; 0 rows in sample, 0 estimated total rows

INFO:  analyzing "public.application_events" inheritance tree

INFO:  "application_events_y2015m09": scanned 11584 of 129704 pages, containing 368816 live rows and 419 dead rows; 11584 rows in sample, 4133206 estimated total rows

INFO:  "application_events_y2015m10": scanned 18415 of 206186 pages, containing 582845 live rows and 0 dead rows; 18415 rows in sample, 6524143 estimated total rows

Answer:

Index

To get best performance for the demonstrated query, create this multicolumn index:

CREATE INDEX application_events_combi_idx ON public.application_events
(application_id, event_type, install_time);

The order of columns is important. The rule of thumb: columns tested for equality first, range conditions last.

Since I only know this one query, I can’t say whether the cost of the index outweighs the benefit. Generally, if you run this kind of query a lot and write performance is not a bottleneck, the index pays off.

If one of the predicates is a constant, a partial index might be more efficient; see the sketch below. I don’t have enough information about the purpose and scope of the query to tell.
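For example, if 'install' is the only event_type this query family ever filters on, a partial index along these lines would be smaller and cheaper to maintain (a sketch; the index name is illustrative, and the WHERE predicate only helps if it is constant in every query that should use the index):

-- Hypothetical partial index: the planner can only use it for queries
-- that repeat the exact predicate event_type = 'install'.
CREATE INDEX application_events_install_idx ON public.application_events
(application_id, install_time)
WHERE event_type = 'install';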

If your write load isn’t exceptionally wild, and you have autovacuum running with a sane configuration, you should be able to get an index-only scan out of this. (Then you won’t see a bitmap index scan at all.)
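Index-only scans also depend on an up-to-date visibility map, so freshly written pages still force heap visits. If autovacuum hasn’t caught up yet, a manual run lets you check whether the plan switches. Note that with inheritance-based partitioning like yours, VACUUM does not recurse to child tables, so target the partition you query (the October partition name is taken from your ANALYZE output above):

-- Refresh the visibility map and planner statistics for the hot partition:
VACUUM ANALYZE public.application_events_y2015m10;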

BTW, if you normalized your table design and stored event_type_id integer (like @Alexandros commented) instead of event_type text, each index entry would only need 44 bytes, while it occupies 52 bytes or more with the text column. And integer is a bit faster for everything on top of that.
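A minimal sketch of that normalization (the lookup table and column names are made up for illustration; you would still need to backfill event_type_id from the old column and then drop it):

-- Hypothetical lookup table for the handful of distinct event types:
CREATE TABLE public.event_types (
   id   serial PRIMARY KEY,
   name text UNIQUE NOT NULL
);

-- A 4-byte FK instead of a repeated text value in every row:
ALTER TABLE public.application_events
   ADD COLUMN event_type_id int4 REFERENCES public.event_types (id);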

Query

Your query, only slightly improved, but without all the noise and redundant parentheses:

SELECT date_trunc('day', install_time) AS d, count(*) AS agg
FROM   application_events
WHERE  application_id = 9
AND    event_type = 'install'
AND    install_time >= '2015-10-01 0:0+0'
AND    install_time <  '2015-10-15 0:0+0'
GROUP  BY 1;
  • id is guaranteed to be NOT NULL in this query, so we can replace count(id) with count(*), which is a tiny bit faster.

  • a BETWEEN b AND c is rewritten internally to a >= b AND a <= c. It is typically more convenient to use the next day as an exclusive upper bound instead. Same result, same performance.

  • All the parentheses I removed were just noise. No other effect.

All the basics for performance optimization apply, of course.

Table layout

If you can afford to rewrite your table, switching the position of two columns saves 8 bytes per row: 2 × 4 bytes of padding before the timestamptz columns. That’s 80 MB or more for 10M rows, and size is one of the important factors for performance:

CREATE TABLE public.application_events (
   id             serial PRIMARY KEY,
   application_id int4 NOT NULL,
   device_id      int4 NOT NULL,
   event_date     date NOT NULL,         -- switched these two columns ..
   install_time   timestamptz NOT NULL,  -- .. to save 8 bytes per row.
   event_time     timestamptz NOT NULL,
   event_name     text NOT NULL,
   event_value    text NOT NULL,
   event_type     text NOT NULL
);

The only other tiny thing is event_type, as discussed above. Other than that, the table is in good shape.
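If you want to verify the padding savings, pg_column_size() applied to a whole row reports its on-disk size, padding included (a quick sanity check, assuming the table is populated; it does scan the table):

-- Average tuple size; compare before and after reordering the columns:
SELECT avg(pg_column_size(t.*)) AS avg_row_bytes
FROM   public.application_events t;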

@Erwin is pointing you generally in the right direction; it is a good idea to build indexes that match your queries well, especially when you see performance this slow.

However, in your particular case, I see a big red flag!

Lossy Pages in the Bitmap Index Scan

You’ll note that you have a Bitmap Index Scan on application_events_application_id, for the condition application_id = 9.

However, just above that, we see a Bitmap Heap Scan on application_events, with more than a million rows reported under Rows Removed by Index Recheck.

What this implies is that work_mem may not be set appropriately: the bitmap is growing too large to fit in memory, forcing PostgreSQL to degrade it to a lossy (page-level) bitmap, which in turn means the bitmap heap scan has to recheck every row on those pages when pulling data from the table.

Long story short: run EXPLAIN (ANALYZE, BUFFERS), and I’ll bet you anything we’ll see lossy pages in your heap scan. We can then tune your work_mem parameter to speed this up quite a bit.
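Something like this, reusing the query from above (the 64MB value is purely illustrative; size work_mem to your RAM and expected concurrency; also note that on PostgreSQL 9.4+ the plan prints a “Heap Blocks: exact=… lossy=…” line under the Bitmap Heap Scan node, while on your 9.3 the Rows Removed by Index Recheck counter is the tell):

EXPLAIN (ANALYZE, BUFFERS)
SELECT date_trunc('day', install_time) AS d, count(*) AS agg
FROM   application_events
WHERE  application_id = 9
AND    event_type = 'install'
AND    install_time >= '2015-10-01 0:0+0'
AND    install_time <  '2015-10-15 0:0+0'
GROUP  BY 1;

-- If the bitmap is going lossy, raise work_mem for the session and re-run:
SET work_mem = '64MB';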

To be continued…?

Now, that isn’t a guaranteed cure-all. Once that part is done, you’ll also want to investigate whether it actually makes sense for the database to filter on the install_time column rather than access it through an index.
