Question :
We have a database where we track the price of products from different sellers. If I strip everythin else it looks like this:
I need to make a report where given a date range I have to return a row for every day and the prices for every seller. There will be days that some seller don’t have a price for a product and there will be days that a seller will have more than one price for a given product.
I’m doing it by creating a temp table with the date ranges then join them with the sellers and prices for this but once product_price have a couple thousands rows it get VERY slow (about 4” to return a 200 rows) and I don’t know how to optimize it to get it fast enought.
I can’t think a workaround for the temp table nor which index should I create.
schema and query here: DB-fiddle
Answer :
You have only primary keys for tables and indexes related to the FK restrictions but no indexes related to the queries you run.
SELECT qtm.date_i
, s.name
, pp.*
FROM query_time_window AS qtm
JOIN seller AS s -- some ON condition is strictly recommended
LEFT JOIN product_price AS pp ON pp.seller_id = s.id -- most conditions are moved to WHERE
WHERE ( pp.product_id = 1 OR pp.product_id IS NULL )
AND ( pp.created_at <= qtm.date_f -- I've changed '!a>b' to 'a<=b'
AND ( pp.cancelled_at IS NULL
OR pp.cancelled_at >= qtm.date_i) -- I've changed '!a<b' to 'a>=b'
)
ORDER BY qtm.date_i, s.id;
Table query_time_window
should have an index (date_i, date_f) or (date_f, date_i)
Table product_price
should have an additional index (product_id, seller_id, created_at, cancelled_at) – an exact order of the columns depend on the real data. You can try different combinations and choose the fastest one.