Optimize IN clause in where query with order by – MySQL

Question :

I am trying to optimize a query that uses an IN clause in WHERE so that it avoids a filesort. To keep it simple, I created the following sample, which shows the problem. Here is my query:

SELECT *
FROM `test`
WHERE user_id = 9898
AND status IN (1,3,4)
ORDER BY id
LIMIT 30;

Here is the result of EXPLAIN. As you can see, the query uses a filesort:

id  select_type  table  type   possible_keys  key      key_len  ref   rows  Extra
1   SIMPLE       test   range  user_id        user_id  8        NULL  3     Using where; Using index; Using filesort

Here is my table structure:

CREATE TABLE IF NOT EXISTS `test` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(10) unsigned NOT NULL,
  `status` int(3) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`,`status`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=7 ;

--
-- Dumping data for table `test`
--

INSERT INTO `test` (`id`, `user_id`, `status`) VALUES
(5, 9797, 2),
(6, 9797, 3),
(4, 9898, 0),
(1, 9898, 2),
(2, 9898, 3),
(3, 9898, 4);

How can I optimize the query? For my real table, I see the following entry in the log (it is in slow query log format):
# Query_time: 26.498180 Lock_time: 0.000175 Rows_sent: 100 Rows_examined: 4926

Answer :

I was looking at a very similar problem today. After doing a ton of searching online, I found this great article by Percona.

Using “UNION ALL”, you can combine one query per status value, like so:

(SELECT * FROM `test` WHERE user_id = 9898 AND status = 1 ORDER BY id LIMIT 30) 
UNION ALL
(SELECT * FROM `test` WHERE user_id = 9898 AND status = 3 ORDER BY id LIMIT 30)
UNION ALL
(SELECT * FROM `test` WHERE user_id = 9898 AND status = 4 ORDER BY id LIMIT 30)
ORDER BY id LIMIT 30

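It looks gnarly compared to SELECT ... status IN (1,3,4), but it’s effective in avoiding the filesort over all matching rows.

As long as the inner “SELECT…” statements are efficient, the performance is good. Here each one fixes both user_id and status, so it can read the (user_id, status) index, whose InnoDB entries for a fixed pair are already in id order, and stop after 30 rows. The outer ORDER BY id LIMIT 30 then only has to sort at most 90 rows.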
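
To check that an inner SELECT of the UNION ALL is efficient, you can run EXPLAIN on one branch by itself. This is a minimal sketch against the sample `test` table from the question. On InnoDB I would expect the `user_id` key to be used and the Extra column to show no "Using filesort", but the plan depends on your MySQL version and data, so treat that as something to verify, not a guarantee.

```sql
-- Inspect a single branch of the UNION ALL on its own.
-- Both indexed columns are fixed by equality, so the server should be able
-- to walk the (user_id, status) index and stop after 30 rows.
EXPLAIN
SELECT *
FROM `test`
WHERE user_id = 9898
  AND status = 3
ORDER BY id
LIMIT 30;
```

If a branch still reports a filesort, the whole UNION ALL will probably not help much either, so it is worth checking each status value you plan to include.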

Another option: put the list of values into a temporary table, and then perform an INNER JOIN to it. Many SQL optimizers and engines handle joins better than they handle an IN operation, especially with a long list. If the list is long, this also allows you to define an index on the temporary table to further assist the optimizer.
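
The temporary-table approach could look roughly like the sketch below. The table name `wanted_status` is hypothetical, chosen only for this example; the rest uses the `test` table from the question. The matching rows still come from several status ranges, so MySQL may still need to sort by id; compare the EXPLAIN output with the original query before relying on it.

```sql
-- Hypothetical helper table holding the status values to match.
-- The primary key doubles as the index on the list of values.
CREATE TEMPORARY TABLE wanted_status (
  status INT UNSIGNED NOT NULL,
  PRIMARY KEY (status)
);

INSERT INTO wanted_status (status) VALUES (1), (3), (4);

-- Join against the list instead of using status IN (1,3,4).
SELECT t.*
FROM `test` AS t
INNER JOIN wanted_status AS w ON w.status = t.status
WHERE t.user_id = 9898
ORDER BY t.id
LIMIT 30;

-- Temporary tables vanish when the session ends; dropping explicitly keeps things tidy.
DROP TEMPORARY TABLE wanted_status;
```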
