Question :
I have a slow query running on a large forum. The EXPLAIN synopsis:
mysql> EXPLAIN
SELECT
COUNT(post_thanks.id) as thanked,
post_thanks.postid,
post.title,
post.dateline
FROM
post_thanks
INNER JOIN post ON post_thanks.postid = post.postid
WHERE
post.threadid = 562942
AND post.parentid != 0
GROUP BY post_thanks.postid
ORDER BY
COUNT(post_thanks.id) DESC,
dateline DESC LIMIT 5
;
+----+-------------+-------------+------+---------------------------------+------------+---------+-------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------------------------+------------+---------+-------------------+------+----------------------------------------------+
| 1 | SIMPLE | post | ref | PRIMARY,idx_parentid,idx_ksours | idx_ksours | 4 | const | 7826 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | post_thanks | ref | postid | postid | 4 | forum.post.postid | 1 | Using where; Using index |
+----+-------------+-------------+------+---------------------------------+------------+---------+-------------------+------+----------------------------------------------+
The indexes on the table:
mysql> SHOW INDEXES FROM post;
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| post | 0 | PRIMARY | 1 | postid | A | 32547435 | NULL | NULL | | BTREE | | |
| post | 1 | ipaddress | 1 | ipaddress | A | 6509487 | NULL | NULL | | BTREE | | |
| post | 1 | userid | 1 | userid | A | 1017107 | NULL | NULL | | BTREE | | |
| post | 1 | userid | 2 | dateline | A | 32547435 | NULL | NULL | | BTREE | | |
| post | 1 | idx_dateline | 1 | dateline | A | 32547435 | NULL | NULL | | BTREE | | |
| post | 1 | idx_ipaddress | 1 | ipaddress | A | 6509487 | NULL | NULL | | BTREE | | |
| post | 1 | idx_parentid | 1 | parentid | A | 16273717 | NULL | NULL | | BTREE | | |
| post | 1 | idx_ksours | 1 | threadid | A | 1713022 | NULL | NULL | | BTREE | | |
| post | 1 | idx_ksours | 2 | visible | A | 1713022 | NULL | NULL | | BTREE | | |
| post | 1 | idx_ksours | 3 | dateline | A | 32547435 | NULL | NULL | | BTREE | | |
| post | 1 | idx_ksours | 4 | userid | A | 32547435 | NULL | NULL | | BTREE | | |
| post | 1 | idx_ksours | 5 | postid | A | 32547435 | NULL | NULL | | BTREE | | |
| post | 1 | idx_ksours | 6 | attach | A | 32547435 | NULL | NULL | | BTREE | | |
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
13 rows in set (0.01 sec)
I’ve been trying to figure out why it is not using the index from the post table.
mysql> SHOW INDEXES FROM post_thanks;
+-------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| post_thanks | 0 | PRIMARY | 1 | id | A | 7177648 | NULL | NULL | | BTREE | | |
| post_thanks | 1 | postid | 1 | postid | A | 7177648 | NULL | NULL | | BTREE | | |
| post_thanks | 1 | idx_userid | 1 | userid | A | 797516 | NULL | NULL | | BTREE | | |
| post_thanks | 1 | idx_userid | 2 | date | A | 7177648 | NULL | NULL | | BTREE | | |
+-------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
Does anyone have ideas as to why it is not using the index(es), or how I can attack this problem to make the query faster? Even if I remove the parentid portion, it still does not use the idx_ksours index.
* Warning, I’m not a DBA
Answer :
Your query actually is using the idx_ksours index.
It seems like a pretty sensible assumption that Using index
in the Extra
field means the query is using the index, and then when that isn’t shown in Extra
then the index isn’t being used… but that’s actually not what that means.
Indexes, as you likely know, contain copies of the indexed column(s) data, in sorted order, along with a copy of the primary key of the row so the row referenced by the index can be retrieved.
When all of the data the query needs to examine in a table is available inside one index, the optimizer realizes that reading the actual row data is unnecessary — everything needed is actually contained in the index, and will be read from the index — eliminating the extra lookup to retrieve the associated row from the full table. This is what Using index
means.
On the other hand, if the optimizer will using an index to optimize the query, the name of that index is shown in key
column of EXPLAIN SELECT
.
+------------+
| key |
+------------+
| idx_ksours |
| postid |
+------------+
So, you’re actually using idx_ksours.
That’s the good news. The bad news, of course, is that this index doesn’t offer enough of an optimization for your entire query.
Note that the second column in idx_ksours
is visible
, which is not evaluated by anything in the join condition or where clause, which means none of the other columns in that index are not used for this query — it’s only helping you find rows WHERE post.threadid = 562942.
The comment from @dezso addresses this — if you had an index with all-relevant columns, in a helpful order, that index would provide more benefit for the query.