How to insert values in junction table for many to many relationships (MySQL)?

Posted on

Question :

I’m new to relational database design and I’ve hit a stumbling block.

I have three tables…

PROJECTS
project_id, name, ...

PROJECTS_UNDERGROUND
projects_FK, underground_FK

(I have both FKs setup as a composite key to eleminate duplicate data)

UNDERGROUND
underground_id, underground_name


Both PROJECTS and UNDERGROUND have a columns and rows of data already populated, such as id’s and names but at present, there is no connection with the Junction table, they’re currently sitting unconnected.
I’m puzzled to the queries I have to write to link the projects with underground station names?
The concept of composite keys make sense, but I can’t seem to fiqure out the queries to update the PROJECTS_UNDERGROUND table.
Thanks for your help, everyone.

Answer :

Knowing a name, you can easily get the matching ID by filtering by the name. For the PROJECTS table, it would look like this:

SELECT
  project_id
FROM
  PROJECTS
WHERE
  name = @project_name
;

Same for UNDERGROUND:

SELECT
  underground_id
FROM
  UNDERGROUND
WHERE
  underground_name = @underground_name
;

Each query returns the result as a row set. The two row sets can be combined into a single row set if you cross-join them:

SELECT
  p.project_id,
  u.underground_id
FROM
  (
    SELECT
      project_id
    FROM
      PROJECTS
    WHERE
      name = @project_name
  ) AS p
CROSS JOIN
  (
    SELECT
      underground_id
    FROM
      UNDERGROUND
    WHERE
      underground_name = @underground_name
  ) AS u
;

And now you just need to insert the result returned by the above query into PROJECTS_UNDERGROUND:

INSERT INTO
  PROJECTS_UNDERGROUND (project_FK, underground_FK)
SELECT
  p.project_id,
  u.underground_id
FROM
  ... /* same query as previously */

It is also possible to rewrite the SELECT part like this:

SELECT
  p.project_id,
  u.underground_id
FROM
  PROJECTS AS p
CROSS JOIN
  UNDERGROUND AS u
WHERE
  name = @project_name
AND
  underground_name = @underground_name
;

Logically, this simpler looking query prescribes the server to first create a cartesian product of the two tables and then filter the result set by the condition in the WHERE clause. If that was indeed how the server would execute the query, it would probably not work very fast. However, it is likely that MySQL can figure out the possibility to push down the conditions in such a simple case, thus effectively changing the logical order of execution, and come up with an efficient execution plan. I invite you to test and compare both solutions.

Leave a Reply

Your email address will not be published. Required fields are marked *