Is it better to put filters on JOIN clauses instead of WHERE clauses

Posted on

Question :

MySQL 5.5.8:

Question 1:
Is there any difference in the following statements and if so, which is preferred performance wise? Can this even be answered without knowing the row counts of the tables before and after the where clause and possible index usage?

SELECT *
FROM a JOIN b ON a.id = b.id AND a.col2 BETWEEN 1 AND 5;

SELECT *
FROM a JOIN b ON a.id = b.id
WHERE a.col2 BETWEEN 1 AND 5;

My reason for asking is because if for instance the tables are massive but the where clause filters 99.9999% of the records out, I don’t want to waste time joining massive tables if after the join 99.9999% of the records are not used.

Question 2:

Take the same situation, which statement (if any difference) requires less IOPS. (index present on b.id and one on a.col2)

Answer :

With any good query planner (and IIRC mysql’s is good enough in this regard) for queries where all joins are inner joins (as your’s is here, unless otherwise specified joins default to inner joins), it does not matter whether you put the filtering or joining comparisons in an ON clause or in the WHERE.

In the presence of outer joins more complicated queries this is a lot less clear cut as the position of the filtering clause can (but does not always) have a significant effect depending on the input data.

To convince yourself, generate a large data-set and try both to see if there is any difference. As a learning exercise to get a good feel for what the optimisations in your chosen DB engine can and can’t do, very little beats some well thought out experimenting, monitoring and benchmarking.

Leave a Reply

Your email address will not be published. Required fields are marked *