PostgreSQL requires “GROUP BY” when aliasing a table?

Posted on

Question :

Here’s something strange: there is a large table named “products” which is partitioned with inheritance such that there are two children: products_active and products_inactive, and the constraint is that in products_active, status=1, and products_inactive gets all the other statuses.

There is a large query which joins a bunch of tables, of which this is the first part:

SELECT, products.status, products.brand_id,, products.description,,,
  products.packaging, products.container, products.country_of_origin, products.category_id, products.product_type_id, products.tsv_keywords,
  COUNT( as prices_count,
  ROUND(AVG(currency_convert(prices.amount,currencies.currency_code,'USD')),2) as avg_price,
  ROUND(MAX(currency_convert(prices.amount,currencies.currency_code,'USD')),2) as high_price,
  ROUND(MIN(currency_convert(prices.amount,currencies.currency_code,'USD')),2) as low_price,
  ts_rank(tsv_keywords, plainto_tsquery('merlot')) as rank,
  ROUND(AVG(ST_Distance(ST_GeographyFromText('SRID=4326;POINT(0.001 0.001)'),ST_GeographyFromText('SRID=4326;POINT(' || stores.longitude || ' ' || stores.latitude || ')')))) AS distance
  JOIN product_types ON products.product_type_id =
  JOIN categories ON products.category_id =
  JOIN prices ON prices.product_id = 
  JOIN currencies ON prices.currency_id =
  JOIN stores ON prices.store_id =
  JOIN brands ON products.brand_id =
  JOIN merchants ON stores.merchant_id =
  JOIN manufacturers ON brands.manufacturer_id =
, delivery_zones

(full query here:

The problem is: note the FROM products... part, if I replace this with FROM products_active AS products, the query errors out with:

ERROR:  column "products.status" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2:, products.status, products.brand_id, products....

The two tables (products and products_active) have identical structures! They are inherited from one another!

Am I missing a subtlety in aliasing table names?

Answer :

The problem is that the 2 tables are not identical.

The first has (id) as the PRIMARY KEY while the second has (id) a UNIQUE constraint (or index, doesn’t matter). This might seem like a minor detail but it isn’t.

Postgres has added in version 9.1, a feature were functionally dependent columns need not be mentioned in a GROUP BY clause (if the column they depend on is mentioned in the GROUP BY) and still used in the HAVING, SELECT and ORDER BY clauses. However the implementation is not 100% complete. It doesn’t identify all possible functional dependencies but only those coming from PRIMARY KEY constraints. While a UNIQUE constraint with all columns NOT NULL does not differ in any meaningful way from a PRIMARY KEY constraint, the implementation does not consider UNIQUE constraints.

So, for example, with these two very similar tables (test in SQLfiddle):

  ( id INT NOT NULL,
    name TEXT NOT NULL,
    status TEXT NOT NULL,
    PRIMARY KEY (id)
  ) ;

CREATE TABLE products_active
  ( id INT NOT NULL,
    name TEXT NOT NULL,
    status TEXT NOT NULL,
    UNIQUE (id)
  ) ;

The query will succeed for the first table:

SELECT, p.status,
FROM products AS p

while it will fail for the second:

SELECT, p.status,
FROM products_active AS p


ERROR: column “p.status” must appear in the GROUP BY clause or be used in an aggregate function

As one of the commenters said: the original query should have had a “GROUP BY” and PostgreSQL will accept a query without it in certain circumstances (see Why do wildcards in GROUP BY statements not work?). This, combined with the information that UNIQUE indexes are not the same as PRIMARY KEYs (PK’s disallow NULLs), solved this question.

Leave a Reply

Your email address will not be published. Required fields are marked *