Simple query with a single join very slow

Posted on

Question :

I have this very slow, simple query that joins a large table (~180M rows) with a smaller table (~60k rows) with a foreign key, filtering an indexed column on the smaller table, ordering by the primary key in the larger table, and then taking the 25 latest rows.

The EXPLAIN shows Using index; Using temporary; Using filesort on the smaller table. Why?

Engine: MySQL 5.7.

Query:

SELECT
    order.id,
    order.company_id,
    order.total
FROM
    order
INNER JOIN
    company ON company.id = order.company_id
WHERE
    company.company_headquarter_id = 23133
ORDER BY order.id DESC
LIMIT 25;
+----+-------------+------------+------------+------+---------------------------------------+----------------------------+---------+-----------------------+------+----------+----------------------------------------------+
| id | select_type | table      | partitions | type | possible_keys                         | key                        | key_len | ref                   | rows | filtered | Extra                                        |
+----+-------------+------------+------------+------+---------------------------------------+----------------------------+---------+-----------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | company    | NULL       | ref  | PRIMARY,company_headquarter_id_idx    | company_headquarter_id_idx | 8       | const                 |    6 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | order      | NULL       | ref  | company_id_idx                        | company_id_idx             | 8       | company.id            |  381 |   100.00 | NULL                                         |
+----+-------------+------------+------------+------+---------------------------------------+----------------------------+---------+-----------------------+------+----------+----------------------------------------------+
CREATE TABLE `order` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `company_id` bigint(20) NOT NULL,
  `total` double(18,2) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `company_id_idx` (`company_id`),
  CONSTRAINT `company_id_fk` FOREIGN KEY (`company_id`) REFERENCES `company` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=186518644 DEFAULT CHARSET=latin1

CREATE TABLE `company` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `company_headquarter_id` bigint(20) NOT NULL,
  `name` varchar(100) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `company_headquarter_id_idx` (`company_headquarter_id`),
  CONSTRAINT `company_headquarter_id_fk` FOREIGN KEY (`company_headquarter_id`) REFERENCES `company_headquarter` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=60825 DEFAULT CHARSET=latin1

CREATE TABLE `company_headquarter` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) NOT NULL,
  `phone` varchar(10) DEFAULT NULL,
  `address_id` bigint(20) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`),
  KEY `address_id_idx` (`address_id`),
  CONSTRAINT `address_id_fk` FOREIGN KEY (`address_id`) REFERENCES `address` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=43862 DEFAULT CHARSET=latin1

CREATE TABLE `address` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `street_address` varchar(100) DEFAULT NULL,
  `zip` varchar(7) DEFAULT NULL,
  `state` varchar(2) DEFAULT NULL,
  `city` varchar(50) DEFAULT NULL,
  `country` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=147360955 DEFAULT CHARSET=latin1

The query becomes faster when I:

  • Remove the ORDER BY clause.
  • Filter company.company_headquarter_id with a company_headquarter_id that has a smaller number of orders. (company_headquarter_id = 23133 has ~3M rows in the order table)
  • Split it into two separate queries:

First:

SELECT
    company.id
FROM
    company
WHERE
    company.company_headquarter_id = 23133;

Second:

SELECT
    order.id,
    order.company_id,
    order.total
FROM
    order
WHERE
    order.company_id IN (20122, 50729, 50730, 50731, 50732, 50733)  /* From first query */
ORDER BY order.id DESC
LIMIT 25;

Any ideas?

Thank you.

EDIT:

When I do:

SELECT STRAIGHT_JOIN
    order.id,
    order.company_id,
    order.total
FROM
    order
INNER JOIN
    company ON company.id = order.company_id
WHERE
    company.company_headquarter_id = 23133
ORDER BY order.id DESC
LIMIT 25;

The query is much faster and EXPLAIN shows a temporary table is not created.

Answer :

If you were to upgrade to at least MySQL 8.0, you’d be able to take advantage of lateral joins:

create index order_ordered on order (company_id, id desc);

SELECT
    lateral_o.id,
    lateral_o.company_id,
    lateral_o.total
FROM
    company c
LATERAL (select o.id
               ,o.company_id
               ,o.total
         from   order o
         where  c.id = o.company_id
         order by o.company_id, o.id desc
         limit 25
        ) lateral_o
WHERE
    c.company_headquarter_id = 23133
ORDER BY lateral_o.id DESC
LIMIT 25;

This will do what it looks like, get the top 25 rows for each company_id value you find using an index, then do a final sort at the end.

The current plans you have will either be: going through the order table in reverse id order and checking if the company belongs to your filter until it succeeds 25 times. OR, it will be going to every single matching order row, sorting and returning the 25 desired. The first choice is fine if the chance of an order being one you care about is high, but awful if it’s not. The second choice is great if there’s only a few matching order rows, but will get worse when there is lots. My suggestion’s performance will be related to how many company rows you identify, and it sounds like there won’t often be many.

Leave a Reply

Your email address will not be published. Required fields are marked *