Question :
If I have a query like this:
SELECT name FROM
(
SELECT * FROM `table`
WHERE id IN (1,40,300)
ORDER BY AnotherColumn
)
From a bandwidth and overhead standpoint, is that the same thing as doing three separate queries? Or is it significantly less overhead?
Answer :
The amount of data you want could be fast to send over a wire, but what is mysql doing to process it and prepare it for transamission ???
Let’s first look at the original query
SELECT name FROM
(
SELECT * FROM `table`
WHERE id IN (1,40,300)
ORDER BY AnotherColumn
)
If id is a primary key, this should go fast. Yet, you do not need to say get all columns in a subquery, and then read on the name from that subquery. You could just craft the SQL as
SELECT * FROM
(
SELECT name,AnotherColumn FROM `table`
WHERE id IN (1,40,300)
)
ORDER BY AnotherColumn
or even better, no subquery at all
SELECT name FROM `table`
WHERE id IN (1,40,300)
ORDER BY AnotherColumn
You could benchmark each of these and find negligible to nominal differences but nothing really to publish a Whitepaper over. If thousands of these queries were executed, a query digest program such as pt-query-digest or mk-query-digest could easily pick out a winner as to who is the fastest in terms of running time per call.
With regard to your original question, think of it in these terms:
If going to a movie costs $8/person and you had to bring three people to see a movie, which is cheaper ?
- CASE01) taking three people to see a movie
- CASE02) taking each person to see the movie at a different time
Well, do the math.
- CASE01) $32, you and the three people
- CASE02) $48, you (three times) and the three people
Obviously, 50% more overhead in CASE02.
If the puzzle involved
- 4 people, CASE01) $40 vs CASE02) $64
- 5 people, CASE01) $48 vs CASE02) $80
- 6 people, CASE01) $56 vs CASE02) $96
- 7 people, CASE01) $64 vs CASE02) $112
- N people, CASE01) 8(N+1) vs CASE02) 16N
As N goes to infinity, the average overhead is doubled.
Calling three queries in a subquery or getting three rows in one subquery sure beats three separate queries anyday.
It seems like MySQL allows ORDER BY in a subselect where it shouldn’t matter, like in this case. The outer SELECT doesn’t have an ORDER BY, so the results of the query aren’t guaranteed to be ordered.
What three SELECT statements are you expecting to emulate, Jakobud? I only see one:
SELECT name, AnotherColumn From Table01 WHERE id IN (1,40,300) ORDER BY AnotherColumn;
Indeed, you’d hope the optimizer would realize that only name and AnotherColumn are necessary to run the query, but I think that’s the actual question here: what does the MySQL optimizer do when shown this statement? That’ll depend on which storage engine is used for Table01, and which indexes are available on it. Ideally, this query has an index over (id, AnotherColumn, Name) so that the query is naturally satisfied from the index and won’t need to seek back to the table for additional columns or perform a sort operation to satisfy the ORDER BY — assuming the ORDER BY is honored.