How to inner join some tables and outer join others?

Posted on

Question :

I’m certain I’m going to be embarrassed by the eventual answer, but I’ve been googling around and banging my head against the wall and can’t figure it out.

I have four tables described below. I’m trying to get the query that shows all the products related to users (through the user_products table) AND all the categories.

Table definitions

CREATE TEMPORARY TABLE categories (
   id INT PRIMARY KEY NOT NULL
   , cat_name TEXT
 );

 CREATE TEMPORARY TABLE products (
   id INT PRIMARY KEY NOT NULL
   , category_id INT
   , prod_name TEXT
 );

 CREATE TEMPORARY TABLE users (
   id INT PRIMARY KEY NOT NULL
   , user_name TEXT
 );

 CREATE TEMPORARY TABLE user_products (
   id INT PRIMARY KEY NOT NULL
   , product_id INT
   , user_id INT
   , quantity INT
 );      

INSERT statements

INSERT INTO categories VALUES (1, 'cat1')
                             ,(2, 'cat2')
                             ,(3, 'cat3')
                             ,(4, 'cat4');

 INSERT INTO products VALUES (1, 1, 'prod1')
                            ,(2, 1, 'prod2')
                            ,(3, 2, 'prod3')
                            ,(4, 3, 'prod4')
                            ,(5, 4, 'prod4');

 INSERT INTO users VALUES (1, 'user1')
                         ,(2, 'user2')
                         ,(3, 'user3');

 INSERT INTO user_products VALUES (1, 1, 1, 4)
                                 ,(2, 2, 1, 3)
                                 ,(3, 3, 1, 3)
                                 ,(4, 2, 2, 3)
                                 ,(5, 2, 2, 2)
                                 ,(6, 5, 2, 2); 

I’ve tried several options, and this certainly feels like an outer join scenario. The most promising query THAT DOES NOT WORK is:

SELECT up.*, p.*, c.*, u.*
FROM user_products AS up
  JOIN users AS u ON u.id = up.user_id
  JOIN products AS p ON p.id = up.product_id
  RIGHT OUTER JOIN categories AS c ON c.id = p.category_id
WHERE u.id = 1 OR u.id IS NULL

Result from above query

 id | product_id | user_id | quantity | id | category_id | prod_name | id | cat_name | id | user_name 
----+------------+---------+----------+----+-------------+-----------+----+----------+----+-----------
  1 |          1 |       1 |        4 |  1 |           1 | prod1     |  1 | cat1     |  1 | user1
  2 |          2 |       1 |        3 |  2 |           1 | prod2     |  1 | cat1     |  1 | user1
  3 |          3 |       1 |        3 |  3 |           2 | prod3     |  2 | cat2     |  1 | user1
    |            |         |          |    |             |           |  3 | cat3     |    | 

The reason it does not work is that it fails to return ‘cat4’ from the categories table. Obviously this is because the WHERE clause is selecting it out of the result set. I’m at a loss to as to get a result set that includes all categories + the details of the products and quantities for which there are records in the specified user_products table.

The query should

  • return all categories
  • return data populated for user info iff the user has that product (a record exists in user_products)
  • not return rows with other user ids

Answer :

It sounds like you are looking for a FULL [OUTER] JOIN. Per documentation:

FULL OUTER JOIN

First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row
is added with null values in columns of T2. Also, for each row of T2
that does not satisfy the join condition with any row in T1, a joined
row with null values in the columns of T1 is added.

SELECT *
FROM   users           u  
JOIN   user_products   up ON u.id = up.user_id
                         AND u.id = 1
JOIN   products        p ON p.id = up.product_id
FULL   JOIN categories c ON c.id = p.category_id;

This returns all categories and also returns all user-product combinations for the given user_id.

If you are enforcing referential integrity with foreign keys and products.category_id is defined NOT NULL, so that every product is assigned to an existing category (not in your question), you can replace the FULL JOIN with a RIGHT JOIN.

Plus, either way, the condition to select a specific user has to move to a subquery or (simpler) to a JOIN condition between the first three tables. It has to be applied before the last table categories is joined.

SQL Fiddle.

I believe you want to show all categories.

If so, you will need to make all of your joins RIGHT OUTER JOINS.

Why?

  1. Associativity and precedence for outer join operators is poorly defined and unpredictable
  2. You need to keep going ‘outer’ once you’ve started. Because you’re using a RIGHT join, each preceding join must also be a RIGHT join. If not, the NULLs in the outer table will never join out to product-user or user tables, and this will cut out your categories as well

Note you will likely see users repeat once you get this working, because users can be associated with multiple products in multiple categories.

Good luck!

Andrew Wolfe

Leave a Reply

Your email address will not be published. Required fields are marked *