Question :
I’d like to optimize Postgres response times for a query which looks for
select
orders.id,
orders.created
from
orders left join order_items on order_items.order_id = orders.id
where
orders.created >= '2011-10-02 23:00:00'
and orders.created <= '2017-12-09 22:59:59.999'
and orders.total_price >= '2000000'
and orders.total_price <= '2147483647'
and order_items.article_name ilike '%test%'
order by
orders.created desc limit '40' offset '0'
In my environment it’s typical that the date range is about 2-5 days and the stored orders are up to 2 years ago. 99% of the query time is used for the gist trigram index scan on article_name. The time range currently has negligible performance impact: it doesn’t matter if I search for orders within 2 days or 2 years.
The simplified DDL contains a compound index on order (created, id):
CREATE TABLE public.orders (
id int4 NOT NULL DEFAULT nextval('orders_id_seq'::regclass),
created timestamp NOT NULL,
total_price int8 NOT NULL,
CONSTRAINT orders_pkey PRIMARY KEY (id)
};
create
unique index order_created_id_idx on
orders
using btree(
created desc,
id
) ;
(
total_price is not null
)
or(
total_price <> 0
)
) ;
create
unique index order_total_price_id_idx_tx_id on
orders
using btree(
total_price desc,
transaction_id
)
where
(
total_price is not null
) ;
CREATE TABLE public.order_items (
id int4 NOT NULL DEFAULT nextval('order_items_id_seq'::regclass),
order_id int8 NOT NULL,
article_name text NULL,
CONSTRAINT order_items_pkey PRIMARY KEY (id),
CONSTRAINT order_items_order_fk FOREIGN KEY (order_id) REFERENCES public.orders(id) ON DELETE CASCADE
)
create
index order_items_article_name_with_like_support_idx on
order_items
using btree(
lower( article_name ) varchar_pattern_ops;
create
index order_item_article_name_trgm_idx on
order_items
using gist(
article_name gist_trgm_ops
) ;
and produces this query plan:
Limit (cost=12575.53..12575.63 rows=40 width=12) (actual time=5334.697..5334.705 rows=40 loops=1)
Output: orders.id, orders.created
Buffers: shared hit=762938 read=7003
I/O Timings: read=206.069
-> Sort (cost=12575.53..12576.39 rows=342 width=12) (actual time=5334.696..5334.701 rows=40 loops=1)
Output: orders.id, orders.created
Sort Key: orders.created DESC
Sort Method: top-N heapsort Memory: 26kB
Buffers: shared hit=762938 read=7003
I/O Timings: read=206.069
-> Nested Loop (cost=0.85..12564.72 rows=342 width=12) (actual time=54.817..5333.618 rows=1739 loops=1)
Output: orders.id, orders.created
Buffers: shared hit=762935 read=7003
I/O Timings: read=206.069
-> Index Scan using order_item_article_name_trgm_idx on public.order_items (cost=0.42..3907.02 rows=2520 width=8) (actual time=10.983..4908.059 rows=35000 loops=1)
Output: order_items.id, order_items.order_id, order_items.article_number, order_items.article_name, order_items.number_of_items, order_items.article_type, order_items.price_per_article, order_items.full_price, order_items.size, order_items.order_position
Index Cond: (order_items.article_name ~~* '%test%'::text)
Rows Removed by Index Recheck: 42821
Buffers: shared hit=622968 read=6719
I/O Timings: read=115.587
-> Index Scan using orders_pkey on public.orders (cost=0.43..3.43 rows=1 width=12) (actual time=0.012..0.012 rows=0 loops=35000)
Output: orders.id, orders.transaction_id, orders.order_number, orders.created, orders.total_price, orders.order_state, orders.shipment_type, orders.payment_type, orders.payment_status, orders.number_of_installments, orders.customer_id, orders.shipment_address_id, orders.invoiced_date, orders.shipped_from, orders.order_origin, orders.number_of_articles, orders.checkout_time, orders.referrer
Index Cond: (orders.id = order_items.order_id)
Filter: ((orders.created >= '2011-10-02 23:00:00'::timestamp without time zone) AND (orders.created <= '2017-12-09 22:59:59.999'::timestamp without time zone) AND (orders.total_price >= '2000000'::bigint) AND (orders.total_price <= '2147483647'::bigint))
Rows Removed by Filter: 1
Buffers: shared hit=139967 read=284
I/O Timings: read=90.482
Planning time: 2.112 ms
Execution time: 5334.814 ms
(How) Can I optimize this use case?
Answer :
Index on o.created – about the only thing. Indices on article_name will not work due to the leading and ending %.