Here’s something strange: there is a large table named “products” which is partitioned with inheritance such that there are two children: products_active and products_inactive, and the constraint is that in products_active,
status=1, and products_inactive gets all the other statuses.
There is a large query which joins a bunch of tables, of which this is the first part:
SELECT products.id, products.status, products.brand_id, products.name, products.description, products.data, products.website, products.packaging, products.container, products.country_of_origin, products.category_id, products.product_type_id, products.tsv_keywords, COUNT(prices.id) as prices_count, ROUND(AVG(currency_convert(prices.amount,currencies.currency_code,'USD')),2) as avg_price, ROUND(MAX(currency_convert(prices.amount,currencies.currency_code,'USD')),2) as high_price, ROUND(MIN(currency_convert(prices.amount,currencies.currency_code,'USD')),2) as low_price, ts_rank(tsv_keywords, plainto_tsquery('merlot')) as rank, ROUND(AVG(ST_Distance(ST_GeographyFromText('SRID=4326;POINT(0.001 0.001)'),ST_GeographyFromText('SRID=4326;POINT(' || stores.longitude || ' ' || stores.latitude || ')')))) AS distance FROM products JOIN product_types ON products.product_type_id = product_types.id JOIN categories ON products.category_id = categories.id JOIN prices ON prices.product_id = products.id JOIN currencies ON prices.currency_id = currencies.id JOIN stores ON prices.store_id = stores.id JOIN brands ON products.brand_id = brands.id JOIN merchants ON stores.merchant_id = merchants.id JOIN manufacturers ON brands.manufacturer_id = manufacturers.id , delivery_zones WHERE ...
(full query here: http://pastebin.com/VjJPTQWj)
The problem is: note the
FROM products... part, if I replace this with
FROM products_active AS products, the query errors out with:
ERROR: column "products.status" must appear in the GROUP BY clause or be used in an aggregate function LINE 2: products.id, products.status, products.brand_id, products....
The two tables (products and products_active) have identical structures! They are inherited from one another!
Am I missing a subtlety in aliasing table names?
The problem is that the 2 tables are not identical.
The first has
(id) as the
PRIMARY KEY while the second has
UNIQUE constraint (or index, doesn’t matter). This might seem like a minor detail but it isn’t.
Postgres has added in version 9.1, a feature were functionally dependent columns need not be mentioned in a
GROUP BY clause (if the column they depend on is mentioned in the
GROUP BY) and still used in the
ORDER BY clauses. However the implementation is not 100% complete. It doesn’t identify all possible functional dependencies but only those coming from
PRIMARY KEY constraints. While a
UNIQUE constraint with all columns
NOT NULL does not differ in any meaningful way from a
PRIMARY KEY constraint, the implementation does not consider
So, for example, with these two very similar tables (test in SQLfiddle):
CREATE TABLE products ( id INT NOT NULL, name TEXT NOT NULL, status TEXT NOT NULL, PRIMARY KEY (id) ) ; CREATE TABLE products_active ( id INT NOT NULL, name TEXT NOT NULL, status TEXT NOT NULL, UNIQUE (id) ) ;
The query will succeed for the first table:
SELECT p.id, p.status, p.name FROM products AS p GROUP BY p.id ;
while it will fail for the second:
SELECT p.id, p.status, p.name FROM products_active AS p GROUP BY p.id ;
ERROR: column “p.status” must appear in the GROUP BY clause or be used in an aggregate function
As one of the commenters said: the original query should have had a “GROUP BY” and PostgreSQL will accept a query without it in certain circumstances (see Why do wildcards in GROUP BY statements not work?). This, combined with the information that UNIQUE indexes are not the same as PRIMARY KEYs (PK’s disallow NULLs), solved this question.