Question :
The item table is like this
id category_id name registered
I have to show total number of items from the beginning. Suppose between 2013-05-01 to 20-05–05, I have to count all the items before 2013-05-01 and add that to count of 2013-05-01 ‘s items. The result have to be grouped by registered and category_id. I have to do this in one query. This is my query
SELECT registered, me.category_id,
COUNT( * ) + (select count(*) from items where me.category_id = category_id
and registered < '2013-05-01 00:00:00') AS Value FROM items me
WHERE (
(
category_id IS NOT NULL
AND
(me.registered BETWEEN '2013-05-01 00:00:00' AND '2013-05-02 00:00:00')
)
)
GROUP BY registered, category_id
It takes almost 28 secs to generate the result. How can i increase the performance.
Update
I am using pgSQL. The registered and category_id is indexed. There are 568814 rows. I will be also very grateful, if any one can tell me that my sql is correct or not.
Update
Sorry for late reply.
CREATE TABLE items
(
id uuid NOT NULL DEFAULT uuid_generate_v4(),
registered timestamp without time zone NOT NULL,
modified timestamp without time zone NOT NULL,
published timestamp without time zone,
owner_id uuid,
status_id integer NOT NULL,
category_id uuid NOT NULL,
location_id uuid,
title text,
description text,
quantity integer NOT NULL DEFAULT 1,
unit text,
price_low bigint,
price bigint,
currency text NOT NULL DEFAULT 'BDT'::text,
ip text,
attributes_json text,
from_mobile boolean NOT NULL DEFAULT false,
CONSTRAINT items_pkey PRIMARY KEY (id),
CONSTRAINT items_fk_category_id FOREIGN KEY (category_id)
REFERENCES categories (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY IMMEDIATE,
CONSTRAINT items_fk_location_id FOREIGN KEY (location_id)
REFERENCES locations (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
CONSTRAINT items_fk_owner_id FOREIGN KEY (owner_id)
REFERENCES users (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY IMMEDIATE,
CONSTRAINT items_fk_status_id FOREIGN KEY (status_id)
REFERENCES item_statuses (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY IMMEDIATE
)
Answer :
-
What kind of column is [registered] -> datetime or varchar
-
Why are you grouping by the full value of the [registered] column ?
Ex. If you have the following values for this column:2013-05-01 00:00:00 2013-05-01 00:00:01 2013-05-01 00:00:02 2013-05-01 00:00:03 2013-05-01 00:00:04
… and you group by them, you will have 5 rows in the output.
Grouping at the minute or hour level will get you less results -> faster query execution time
… if you have 568.814 rows, how many rows does your query return ? I suppose the result set is very close to 568.814 rows 🙂 -
In the current form, for each row generated by the primary query, the subquery has to be run. Try to avoid that …
Here my version of your query. Based on the same input, the output will be the same:
SELECT
t1.registered,
t1.category_id,
(COUNT(*) + COALESCE(q2.c2,0)) AS Value
FROM
items t1
LEFT JOIN
(
SELECT
t2.category_id,
COUNT(*) AS c2
FROM
items t2
WHERE
t2.registered < '2013-05-01 00:00:00'
GROUP BY
t2.category_id
) q2
ON t1.category_id = q2.category_id
WHERE
t1.registered BETWEEN '2013-05-01 00:00:00'
AND '2013-05-02 00:00:00'
GROUP BY
t1.registered,
t1.category_id,
q2.c2 ;
Output
REGISTERED CATEGORY_ID VALUE
---------------------------------------------
May, 01 2013 00:00:00+0000 1 5
May, 01 2013 01:00:00+0000 1 5
May, 01 2013 02:00:00+0000 2 6
May, 01 2013 07:00:00+0000 3 3
May, 01 2013 08:00:00+0000 5 3
May, 01 2013 09:00:00+0000 1 5
Query Plan
HashAggregate (cost=35.27..35.32 rows=4 width=20)
-> Hash Right Join (cost=33.63..35.23 rows=4 width=20)
Hash Cond: (t2.category_id = t1.category_id)
-> HashAggregate (cost=20.82..21.49 rows=67 width=4)
-> Bitmap Heap Scan on items t2 (cost=6.27..19.52 rows=260 width=4)
Recheck Cond: (registered < '2013-05-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on idx_reg (cost=0.00..6.20 rows=260 width=0)
Index Cond: (registered < '2013-05-01 00:00:00'::timestamp without time zone)
-> Hash (cost=12.76..12.76 rows=4 width=12)
-> Bitmap Heap Scan on items t1 (cost=4.29..12.76 rows=4 width=12)
Recheck Cond: ((registered >= '2013-05-01 00:00:00'::timestamp without time zone) AND (registered <= '2013-05-02 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on idx_reg (cost=0.00..4.29 rows=4 width=0)
Index Cond: ((registered >= '2013-05-01 00:00:00'::timestamp without time zone) AND (registered <= '2013-05-02 00:00:00'::timestamp without time zone))
For anyone interested, here is the sql fiddle page with sample data:
SQL-Fiddle-2
with items as (
select registered, category_id
from items
where
category_id is not null
and
registered < '2013-05-02'
), items_restricted as (
select registered, category_id
from items
where registered >= '2013-05-01'
), category_id as (
select distinct category_id
from items_restricted
)
select distinct ir.registered, ir.category_id, "value"
from
items_restricted ir
inner join
(
select
category_id, count(*) as value
from items
where exists (
select category_id
from category_id
where category_id = items.category_id
)
group by category_id
) s using(category_id)
order by registered, category_id