Optimizing ORDER BY for simple MySQL query

Posted on

Question :

I’m trying to optimize this really simple query, and it gives me grief for a day now 🙁

Seems pretty straightforward, I need to select with a JOIN from 2 tables, and get top X results sorted in a particular order. Here’s the query:

SELECT * FROM `po` 
INNER JOIN po_suppliers s ON po.supplier_id = s.id
ORDER BY po.id ASC
LIMIT 10

However, it runs really slow (half a second to 2 seconds). Here are table structures:

CREATE TABLE `po` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `supplier_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `supplier_id` (`supplier_id`)
) ENGINE=InnoDB AUTO_INCREMENT=457790 DEFAULT CHARSET=latin1

CREATE TABLE `po_suppliers` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9386 DEFAULT CHARSET=latin1

Running EXPLAIN reveals the issue:

+----+-------------+-------+-------+---------------+-------------+---------+--------------+------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref          | rows | Extra                                        |
+----+-------------+-------+-------+---------------+-------------+---------+--------------+------+----------------------------------------------+
|  1 | SIMPLE      | s     | index | PRIMARY       | PRIMARY     | 4       | NULL         |  480 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | po    | ref   | supplier_id   | supplier_id | 4       | sergiis.s.id |    1 | Using index                                  |
+----+-------------+-------+-------+---------------+-------------+---------+--------------+------+----------------------------------------------+

Can you help me figure out how I can optimize this to run faster? I have index on the column I sort by. I have index on columns I join by. If I remove the JOIN – it’s crazy fast. If I remove ORDER BY, it’s crazy fast. Why am I getting this dreaded temporary + filesort?

Answer :

The EXPLAIN SELECT you posted definitely seems counter-intuitive.

If your query included WHERE s.id = ... then the query plan you’re seeing might make a little bit more sense, but I’m assuming you’re not.

It looks like the optimizer is getting distracted by the facts that supplier is a smaller table and that the supplier_id index in the po table can be used as a covering index… and with those facts in hand, it’s overlooking the seemingly-obvious fact that the tables should be read in the opposite order than the one it chooses.

Here are two alternatives.

— use the STRAIGHT_JOIN directive to insist that the optimizer process the tables in only the listed order:

SELECT STRAIGHT_JOIN * FROM `po` 
INNER JOIN po_suppliers s ON po.supplier_id = s.id
ORDER BY po.id ASC
LIMIT 10;

— use the FORCE KEY index hint to direct the optimizer to prefer the primary key of the po table:

SELECT * FROM `po` FORCE KEY (PRIMARY) 
INNER JOIN po_suppliers s ON po.supplier_id = s.id
ORDER BY po.id ASC
LIMIT 10;

The first option is probably the better option, since FORCE KEY, in spite of the name, is still only a “hint” that the optimizer can choose to ignore, while STRAIGHT_JOIN genuinely does force the hand of the optimizer to join the tables in the order they’re listed.

Leave a Reply

Your email address will not be published. Required fields are marked *