Question :
I’ve got some large MySQL (5.5.60) tables where queries take 60 seconds+ to complete.
product_base (1.5GB, 26 million rows)
CREATE TABLE `product_base` ( `EAN13` varchar(13) NOT NULL DEFAULT '', `company_tk` int(11) DEFAULT NULL, `category` int(11) DEFAULT NULL, `modified` varchar(30) DEFAULT NULL, KEY `EAN13` (`EAN13`), KEY `company_tk` (`company_tk`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
attribute (11.8GB, 126 million rows)
CREATE TABLE `attribute` ( `EAN13` varchar(13) DEFAULT NULL, `attribute_type` int(11) DEFAULT NULL, `val_t` varchar(1500) DEFAULT NULL, `val_n` decimal(15,4) DEFAULT NULL, `val_d` datetime DEFAULT NULL, `modified` datetime DEFAULT NULL, KEY `EAN13` (`EAN13`), KEY `attribute_type` (`attribute_type`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
attribute_type (16KB, 59 rows)
CREATE TABLE `attribute_type` ( `attribute_type` int(11) DEFAULT NULL, `title` varchar(90) DEFAULT NULL, `field_name` varchar(90) DEFAULT NULL, `data_type` char(3) DEFAULT NULL, `seq` int(11) NOT NULL, `html` int(11) DEFAULT NULL, KEY `attribute_type` (`attribute_type`), KEY `seq` (`seq`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Example slow query:
Query_time: 61.629684 Lock_time: 0.000138 Rows_sent: 1 Rows_examined: 28141979 select p.EAN13, p.company_tk, a.attribute_type, a.val_t, t.title, t.field_name from product_base p inner join attribute a on p.EAN13=a.EAN13 inner join attribute_type t on a.attribute_type=t.attribute_type where p.EAN13=705632109762 and (a.attribute_type=1 or a.attribute_type=3) order by t.seq;
Running explain
on that query:
+----+-------------+-------+-------+----------------------+----------------+---------+--------------------------+---------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+----------------------+----------------+---------+--------------------------+---------+-----------------------------+ | 1 | SIMPLE | t | range | attribute_type | attribute_type | 5 | NULL | 2 | Using where; Using filesort | | 1 | SIMPLE | a | ref | EAN13,attribute_type | attribute_type | 5 | eandata.t.attribute_type | 6299937 | Using where | | 1 | SIMPLE | p | ref | EAN13 | EAN13 | 54 | eandata.a.EAN13 | 1 | | +----+-------------+-------+-------+----------------------+----------------+---------+--------------------------+---------+-----------------------------+
Server/OS:
Debian 8 64bit | 24GB RAM (MySQL is using 70% of this) | 60GB SSD | 2xIntel Xeon CPU E5506 @ 2.13GHz | MySQL 5.5.60
MySQL my.cnf
# * Fine Tuning key_buffer = 16M max_allowed_packet = 16M thread_stack = 192K thread_cache_size = 8 # This replaces the startup script and checks MyISAM tables if needed # the first time they are touched myisam-recover-options = BACKUP max_connections = 20 table_cache = 9552 thread_cache = 10 sort_buffer_size = 512K net_buffer_length = 8K read_buffer_size = 256K read_rnd_buffer_size = 512K myisam_sort_buffer_size = 16M query_cache_size = 512M join_buffer_size = 2G innodb_buffer_pool_size = 15G innodb_io_capacity = 2000 innodb_read_io_threads = 64 innodb_thread_concurrency = 0 innodb_write_io_threads = 64 #innodb_log_file_size = 512M innodb_flush_log_at_trx_commit = 1 innodb_flush_method = O_DIRECT innodb_buffer_pool_instances = 8 max_heap_table_size = 2G tmp_table_size = 2G wait_timeout = 120 interactive_timeout = 120
Any suggestions for the slow queries?
Answer :
I’m answering my own question here. The other answers have some great optimisation tips, but the key to getting my queries down from 60s to 100ms was to treat the EAN13 field as a string in the where statement.
As an example:
where p.EAN13=705632109762
became where p.EAN13='705632109762'
I would try using UNION except for OR condition.
You could try:
( select p.EAN13, p.company_tk, a.attribute_type, a.val_t, t.title, t.field_name, t.seq from product_base p
inner join attribute a on p.EAN13=a.EAN13
inner join attribute_type t on a.attribute_type=t.attribute_type
where p.EAN13=705632109762 and a.attribute_type=1 )
union all
( select p.EAN13, p.company_tk, a.attribute_type, a.val_t, t.title, t.field_name, t.seq from product_base p
inner join attribute a on p.EAN13=a.EAN13
inner join attribute_type t on a.attribute_type=t.attribute_type
where p.EAN13=705632109762 and a.attribute_type=3 )
order by 7;
Maybe MySQL optimizer will choose better execution plan.
-
You do InnoDB a disservice by not providing an explicit
PRIMARY KEY
. Look at the data to see if an existing column or combination of columns is unique. -
Perhaps
EAN13
could be the PK ofproduct_base
? -
Perhaps
(EAN13, attribute_type)
could be the PK forattribute
? -
In one table
modified
is a string type, in another it isDATETIME
. ?? Anyway, do you use the column? If not removing it would make the table smaller, thereby possibly leading to a faster query. -
As already noted, the type inconsistency of
EAN13
led to a serious inefficiency. This is because VARCHAR = int-constant does not optimize well. Either change the column to beBIGINT
(if appropriate) or add quotes around the constant (to make it a string). -
These 3 tables smell like an “Entity-Attribute-Value” schema, which can usually be implemented with fewer tables — hence more efficiently. Is it EAV?
-
You “can’t tune your way out of a performance problem”. But if you also provided
SHOW GLOBAL STATUS;
after being up at least a day, I will review theVARIABLES
you already provided. -
These are dangerously high; don’t make them more than 1% of RAM:
max_heap_table_size = 2G
andtmp_table_size = 2G
.
To understand the ways to optimize let me to transform Your query to cartesian form (all joins are inner one, so it’s correct) by replacing JOINs with commas and moving ON conditions to WHERE section:
SELECT p.EAN13, p.company_tk, a.attribute_type, a.val_t, t.title, t.field_name
FROM product_base p, attribute a, attribute_type t
WHERE p.EAN13=705632109762
AND p.EAN13=a.EAN13
AND a.attribute_type IN (1,3)
AND a.attribute_type=t.attribute_type
ORDER BY t.seq
Do You see? No conditional join of product_base needed, the condition p.EAN13=a.EAN13
can be replaced with constant one a.EAN13=705632109762
. And we obtain:
SELECT p.EAN13, p.company_tk, a.attribute_type, a.val_t, t.title, t.field_name
FROM product_base p, attribute a, attribute_type t
WHERE p.EAN13=705632109762
AND a.EAN13=705632109762
AND a.attribute_type IN (1,3)
AND a.attribute_type=t.attribute_type
ORDER BY t.seq
There is the only condition on product_base
table, and EAN13
is already indexed, so nothing on that table can be optimised.
There is 2 conditions on attribute
table. To optimize, You need to create composite index
CREATE INDEX idx_EAN_at ON attribute (EAN13, attribute_type)
If nevertheless the attribute (attribute_type)
index will be used, add IGNORE INDEX
hint. Moreover, check if You really needs it, maybe it can be deleted without consequences. And creation of attribute (EAN13, attribute_type)
index makes the attribute (EAN13)
index unnecessary.
There is only one joining condition plus ordering on attribute_type
table. But this table is small, so I think no optimization needed, existent indices are sufficient for optimization.
Suggestion for your my.cnf-ini [mysqld] section
innodb_log_buffer_size=8M
SHOULD NEVER BE larger than
innodb_log_file_size=5M
takes special handling to correct with 5.5 and after you get to 5.6 you may
innodb_lru_scan_depth=128 # from 1024 to conserve CPU seconds every SECOND.
because you have innodb_buffer_pool_instances at 8.
Good luck.
For additional assistance, check my profile, clk Network Profile for contact info.