Question :
This is my table definition, without any special index for now:
CREATE TABLE `filter` (
`field_name` varchar(32) COLLATE utf8_unicode_ci NOT NULL,
`category_id` varchar(32) COLLATE utf8_unicode_ci NOT NULL,
`position` smallint(6) NOT NULL,
`options` longtext COLLATE utf8_unicode_ci COMMENT '(DC2Type:json_array)',
PRIMARY KEY (`field_name`,`category_id`),
KEY `IDX_702C956612469DE2` (`category_id`),
CONSTRAINT `FK_702C956612469DE2`
FOREIGN KEY (`category_id`) REFERENCES `category` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Sample data:
+------------+-----------------------+-------------+----------+
| field_name | options | category_id | position |
+------------+-----------------------+-------------+----------+
| color | {"label": "Color"} | 1895 | 1 |
| material | {"label": "Material"} | 1895 | 2 |
| color | {"label": "Color"} | 1896 | 1 |
| color | {"label": "Color"} | 1897 | 1 |
+------------+-----------------------+-------------+----------+
Most of the time I do a query with WHERE IN
and ORDER BY position
and selecting only field_name
and options
:
EXPLAIN SELECT
field_name
options
FROM
filter
WHERE
category_id IN ('1895', '1896', '1897')
ORDER BY
position ASC
This gives me Using index condition; Using filesort, quite bad uh?
{
"query_block":{
"select_id":1,
"ordering_operation":{
"using_filesort":true,
"table":{
"table_name":"filter",
"access_type":"range",
"possible_keys":[
"IDX_702C956612469DE2"
],
"key":"IDX_702C956612469DE2",
"used_key_parts":[
"category_id"
],
"key_length":"98",
"rows":5,
"filtered":100,
"index_condition":"(`filter`.`category_id` in ('1895','1896','1897'))"
}
}
}
}
With an index on category_id, position
Using where; Using index; Using filesort:
{
"query_block":{
"select_id":1,
"ordering_operation":{
"using_filesort":true,
"table":{
"table_name":"filter",
"access_type":"range",
"possible_keys":[
"IDX_702C956612469DE2",
"IDX_CATEGORY_POSITION"
],
"key":"IDX_CATEGORY_POSITION",
"used_key_parts":[
"category_id"
],
"key_length":"98",
"rows":5,
"filtered":100,
"using_index":true,
"attached_condition":"(`filter`.`category_id` in ('1895','1896','1897'))"
}
}
}
}
So which explain is better? I would say the first because of using index condition (there is no “where” part). Is there any way to avoid the filesort?
Answer :
Given the structure of multicolumn B-tree index it is not viable to do a sort on position
when using IN on category_id
. But as the test data suggest the position
is not “global” but seems to have a meaning only for given category_id
. So as I suggested in comments, it is instead possible to ORDER BY (category_id, position)
– that can use two-column index on those columns to get it without filesort. http://sqlfiddle.com/#!9/aabaa/7
Because of the longtext column it is not possible to turn this into an index-only scan (text columns can be only indexed on prefix, not on entire value, as index key length is limited).
Using varchar fields as primary key has some drawbacks, but it is not “wrong” by itself so just a little suggestion – would it make sense in your desing to use some integer ids instead?