The item table is like this
id category_id name registered
I have to show total number of items from the beginning. Suppose between 2013-05-01 to 20-05–05, I have to count all the items before 2013-05-01 and add that to count of 2013-05-01 ‘s items. The result have to be grouped by registered and category_id. I have to do this in one query. This is my query
SELECT registered, me.category_id, COUNT( * ) + (select count(*) from items where me.category_id = category_id and registered < '2013-05-01 00:00:00') AS Value FROM items me WHERE ( ( category_id IS NOT NULL AND (me.registered BETWEEN '2013-05-01 00:00:00' AND '2013-05-02 00:00:00') ) ) GROUP BY registered, category_id
It takes almost 28 secs to generate the result. How can i increase the performance.
I am using pgSQL. The registered and category_id is indexed. There are 568814 rows. I will be also very grateful, if any one can tell me that my sql is correct or not.
Sorry for late reply.
CREATE TABLE items ( id uuid NOT NULL DEFAULT uuid_generate_v4(), registered timestamp without time zone NOT NULL, modified timestamp without time zone NOT NULL, published timestamp without time zone, owner_id uuid, status_id integer NOT NULL, category_id uuid NOT NULL, location_id uuid, title text, description text, quantity integer NOT NULL DEFAULT 1, unit text, price_low bigint, price bigint, currency text NOT NULL DEFAULT 'BDT'::text, ip text, attributes_json text, from_mobile boolean NOT NULL DEFAULT false, CONSTRAINT items_pkey PRIMARY KEY (id), CONSTRAINT items_fk_category_id FOREIGN KEY (category_id) REFERENCES categories (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY IMMEDIATE, CONSTRAINT items_fk_location_id FOREIGN KEY (location_id) REFERENCES locations (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE, CONSTRAINT items_fk_owner_id FOREIGN KEY (owner_id) REFERENCES users (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY IMMEDIATE, CONSTRAINT items_fk_status_id FOREIGN KEY (status_id) REFERENCES item_statuses (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY IMMEDIATE )
What kind of column is [registered] -> datetime or varchar
Why are you grouping by the full value of the [registered] column ?
Ex. If you have the following values for this column:
2013-05-01 00:00:00 2013-05-01 00:00:01 2013-05-01 00:00:02 2013-05-01 00:00:03 2013-05-01 00:00:04
… and you group by them, you will have 5 rows in the output.
Grouping at the minute or hour level will get you less results -> faster query execution time
… if you have 568.814 rows, how many rows does your query return ? I suppose the result set is very close to 568.814 rows 🙂
In the current form, for each row generated by the primary query, the subquery has to be run. Try to avoid that …
Here my version of your query. Based on the same input, the output will be the same:
SELECT t1.registered, t1.category_id, (COUNT(*) + COALESCE(q2.c2,0)) AS Value FROM items t1 LEFT JOIN ( SELECT t2.category_id, COUNT(*) AS c2 FROM items t2 WHERE t2.registered < '2013-05-01 00:00:00' GROUP BY t2.category_id ) q2 ON t1.category_id = q2.category_id WHERE t1.registered BETWEEN '2013-05-01 00:00:00' AND '2013-05-02 00:00:00' GROUP BY t1.registered, t1.category_id, q2.c2 ;
REGISTERED CATEGORY_ID VALUE --------------------------------------------- May, 01 2013 00:00:00+0000 1 5 May, 01 2013 01:00:00+0000 1 5 May, 01 2013 02:00:00+0000 2 6 May, 01 2013 07:00:00+0000 3 3 May, 01 2013 08:00:00+0000 5 3 May, 01 2013 09:00:00+0000 1 5
HashAggregate (cost=35.27..35.32 rows=4 width=20) -> Hash Right Join (cost=33.63..35.23 rows=4 width=20) Hash Cond: (t2.category_id = t1.category_id) -> HashAggregate (cost=20.82..21.49 rows=67 width=4) -> Bitmap Heap Scan on items t2 (cost=6.27..19.52 rows=260 width=4) Recheck Cond: (registered < '2013-05-01 00:00:00'::timestamp without time zone) -> Bitmap Index Scan on idx_reg (cost=0.00..6.20 rows=260 width=0) Index Cond: (registered < '2013-05-01 00:00:00'::timestamp without time zone) -> Hash (cost=12.76..12.76 rows=4 width=12) -> Bitmap Heap Scan on items t1 (cost=4.29..12.76 rows=4 width=12) Recheck Cond: ((registered >= '2013-05-01 00:00:00'::timestamp without time zone) AND (registered <= '2013-05-02 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_reg (cost=0.00..4.29 rows=4 width=0) Index Cond: ((registered >= '2013-05-01 00:00:00'::timestamp without time zone) AND (registered <= '2013-05-02 00:00:00'::timestamp without time zone))
For anyone interested, here is the sql fiddle page with sample data:
with items as ( select registered, category_id from items where category_id is not null and registered < '2013-05-02' ), items_restricted as ( select registered, category_id from items where registered >= '2013-05-01' ), category_id as ( select distinct category_id from items_restricted ) select distinct ir.registered, ir.category_id, "value" from items_restricted ir inner join ( select category_id, count(*) as value from items where exists ( select category_id from category_id where category_id = items.category_id ) group by category_id ) s using(category_id) order by registered, category_id