Can a transitive join loop infinitely?

Posted on

Question :

I have a table that stores alternative product:

+--------------+-------------+------+-----+---------+----------------+
| Field        | Type        | Null | Key | Default | Extra          |
+--------------+-------------+------+-----+---------+----------------+
| alternate_id | bigint(20)  | NO   | PRI | NULL    | auto_increment |
| company_id   | varchar(36) | NO   | MUL | NULL    |                |
| product1     | bigint(20)  | NO   | MUL | NULL    |                |
| product2     | bigint(20)  | NO   | MUL | NULL    |                |
+--------------+-------------+------+-----+---------+----------------+

I’m trying to find products that are equal transitively, like so:

enter image description here

In the above I’d want to query all of the products that are alternates for A (which should be B, C, and D), so:

SELECT ... FROM table
-- some cool join
WHERE product1 = A

Is this even possible with SQL?

Answer :

Try something like (MySQL 8+ version needed):

WITH RECURSIVE 
products AS ( SELECT product1 p1, product2 p2
              FROM test
            UNION
              SELECT product2 p1, product1 p2
              FROM test
            UNION
              SELECT p.p1, t.product2
              FROM products p, test t
              WHERE p.p2 = t.product1
            UNION
              SELECT p.p1, t.product1
              FROM products p, test t
              WHERE p.p2 = t.product2
           )
SELECT p1 product, GROUP_CONCAT(p2 ORDER BY p2) alternatives
FROM products 
GROUP BY p1
ORDER BY p1;

fiddle

On MySQL 5.x the task can be solved in the Stored Procedure form (emulate CTE using temporary table, loop until records count not altered).

Draft code (must be debugged):

CREATE PROCEDURE GetAlternatives ()
BEGIN
  DROP TEMPORARY TABLE IF EXISTS products;
  CREATE TEMPORARY TABLE products (p1 BIGINT, p2 BIGINT, UNIQUE (p1, p2)) Engine = MEMORY;
  INSERT INTO products (p1, p2)
      SELECT product1 p1, product2 p2
      FROM test
    UNION
      SELECT product2 p1, product1 p2
      FROM test;
  SELECT COUNT(*) INTO @cnt1 FROM products;
  LOOP
    INSERT IGNORE INTO products (p1, p2)
        SELECT p.p1, t.product2
        FROM products p, test t
        WHERE p.p2 = t.product1
      UNION
        SELECT p.p1, t.product1
        FROM products p, test t
        WHERE p.p2 = t.product2;
    SELECT COUNT(*) INTO @cnt2 FROM products;
    IF @cnt1 = @cnt2 THEN
      LEAVE LOOP;
    END IF;
    SET @cnt2 := @cnt1;
  END LOOP;
  SELECT p1 product, GROUP_CONCAT(p2 ORDER BY p2) alternatives
  FROM products 
  GROUP BY p1
  ORDER BY p1;
END

But I recommend you to alter your DB structure and create “product group” entity. So the table structure will be altered to (company_id, product_id, product_group_id), where product_group_id is a reference to the product_group table.

The above code shows how to deal with your request in both directions.
First, it declares the DDL in memory tables, then, shows the results in both directions, yes, it was not requested but I wanted to check myself it this was possible. Then, makes a UNION to select records from both directions, forward and backwards:

WITH CTE1 AS -- FORWARD DIRECTION
(
    SELECT product1, product2 FROM TestData WHERE [product1] = 'A'
    UNION ALL
    SELECT A.product1, A.product2
        FROM TestData A
        INNER JOIN CTE1
            ON CTE1.product2 = A.product1
),
CTE2 AS     -- BACWARDS DIRECTION
(
    SELECT product1, product2 FROM TestData WHERE [product1] = 'A'
    UNION ALL
    SELECT A.product1, A.product2
        FROM TestData A
        INNER JOIN CTE2
            ON CTE2.product1 = A.product2
)
SELECT * FROM CTE1
UNION
SELECT * FROM CTE2 ;

Leave a Reply

Your email address will not be published. Required fields are marked *