I am running MySQL 5.6.41 on AWS RDS. I have one multi-AZ master and one multi-AZ read replica. It is a vanilla MySQL configuration, so the binlog is in MIXED format and the replication process on the replica is single-threaded.
When I attach a new replica, it boots with ~1 hour of lag and takes about 10 hours to catch up. When I switch the binlog format to ROW on the replica, the lag starts increasing indefinitely. I have tracked this down to the SQL_THREAD on the replica, which is struggling to keep up.
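For anyone reproducing this diagnosis: the two replication threads can be inspected directly on the replica. This is standard MySQL 5.6, nothing RDS-specific:

```sql
-- On the replica: check both replication threads and the lag counter.
-- Slave_IO_Running / Slave_SQL_Running show whether each thread is alive;
-- Seconds_Behind_Master grows when the SQL thread cannot keep up even
-- though the IO thread has already fetched the events.
SHOW SLAVE STATUS\G

-- The current state of each thread (e.g. the SQL thread stuck applying
-- one large transaction) is visible in the process list:
SHOW PROCESSLIST;
```

If `Read_Master_Log_Pos` keeps pace with the master while `Exec_Master_Log_Pos` falls behind, the bottleneck is the SQL (applier) thread, as described above.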
I’m trying to work out why replication is so slow, given that no system resources appear to be saturated.
The write IOPS on the master do not exceed 2,500/s, and only during short-lived spikes; the average is more like 470/s. Read IOPS are very low: no more than 100/s. We have a baseline capacity of 2,700 IOPS because we have a 900 GiB General Purpose SSD (see the AWS docs on gp2 volumes). I also don't think the disk is the bottleneck, because DiskQueueDepth doesn't exceed 1.
I don't believe the issue is network saturation, because the IO_THREAD on the replica is keeping up with the master.
The CPU and memory usage on the replica seem ok. CPU doesn’t go above 8% utilization on an 8 vCPU instance, and it has 8 GiB freeable memory.
Does anyone have any ideas for how I can track this down further?
Single-threading is very likely to be the cause of getting “behind”.
A Master can execute multiple transactions at the same time. But a single-threaded Slave is limited to one transaction at a time. This is the old, simple, way of doing Replication.
Newer replication techniques are more complex. Depending on what version of MySQL you are running, multi-threading may be limited to “one thread per database”, thereby not involving much extra complexity.
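In 5.6, that per-database flavor of multi-threaded replication is controlled by `slave_parallel_workers`. On RDS this would normally go in the Parameter Group; shown here as plain SQL for illustration, and it only helps if the writes are spread across multiple schemas:

```sql
-- On the replica: stop the applier, enable per-database parallel apply,
-- then restart it. With writes concentrated in one database (as appears
-- to be the case here), this will NOT help -- 5.6 cannot parallelize
-- within a single schema.
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_workers = 4;
START SLAVE SQL_THREAD;
```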
(I, too, need at least A and B, plus RAM size.)
Find some of the slowest DML queries; there may be ways to speed them up. Also, do you have a lot of Reads on the Replica? If so, it could be adding to the sluggishness.
- Version: 5.6.41-log
- 30 GB of RAM
- Uptime = 6d 07:36:37
- You are not running on Windows.
- Running 64-bit version
- You appear to be running entirely (or mostly) InnoDB.
The More Important Issues:
Raise innodb_io_capacity, currently 200, to 1000. (Check the AWS documentation on this topic.)
Assuming you have SSDs, turn off innodb_flush_neighbors (set it to 0).
Is some other process purging binlogs? (cf expire_logs_days = 0)
Ignoring binlog errors seems dangerous. (cf binlog_error_action = IGNORE_ERROR)
The traffic leans heavily toward INSERTs; is this normal? What do the INSERTs look like? Single-row inserts? Multi-row? Other? We may be able to tweak them to avoid some of the slave lag.
Related to that, please discuss the apparent 275 transactions per second.
Since the main thrust of the Question is about slave lag, let’s also approach this from the viewpoint of “slow queries”. Set
long_query_time=1 and turn on the Slow Log. (AWS may have a web page laying out details.) Do this for both the Master and the slave. I prefer these ancillary settings, especially for the “slave lag” situation:
- log_queries_not_using_indexes = OFF
- log_slow_admin_statements = ON
- log_slow_slave_statements = ON
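On RDS those belong in the Parameter Group, but on a plain MySQL 5.6 server the equivalent runtime settings would be (a sketch; all of these variables are dynamic in 5.6):

```sql
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;                  -- log anything slower than 1 second
SET GLOBAL log_queries_not_using_indexes = OFF;  -- avoid drowning in fast-but-unindexed queries
SET GLOBAL log_slow_admin_statements = ON;
SET GLOBAL log_slow_slave_statements = ON;       -- so replicated writes show up too
```

The last one is the important setting for the slave-lag question: without it, statements applied by the SQL thread never appear in the slow log.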
Details and other observations:
( innodb_lru_scan_depth ) = 1,024
— “InnoDB: page_cleaner: 1000ms intended loop took …” may be fixed by lowering lru_scan_depth
( (Innodb_buffer_pool_reads + Innodb_buffer_pool_pages_flushed) ) = ((3078624 + 150107660) ) / 545797 = 280 /sec — InnoDB I/O
— Increase innodb_buffer_pool_size (now 22548578304)?
( Innodb_buffer_pool_pages_flushed ) = 150,107,660 / 545797 = 275 /sec — Writes (flushes)
— Increase innodb_buffer_pool_size (now 22548578304)?
( Innodb_os_log_written ) = 318,630,569,472 / 545797 = 583789 bytes/sec — This is an indicator of how busy InnoDB is.
— Very busy InnoDB.
( Innodb_log_writes ) = 150,500,027 / 545797 = 275 /sec
( Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group / innodb_log_file_size ) = 318,630,569,472 / (545797 / 3600) / 2 / 128M = 7.83 — Ratio
— (see minutes)
( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 545,797 / 60 * 128M / 318630569472 = 3.83 — Minutes between InnoDB log rotations Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf.
— (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size (now 134217728). (Cannot change in AWS.)
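The arithmetic behind that 3.83 can be reproduced directly from the SHOW GLOBAL STATUS / VARIABLES numbers (hard-coded here from the report above):

```sql
-- Minutes between InnoDB log rotations:
--   Uptime / 60 * innodb_log_file_size / Innodb_os_log_written
SELECT 545797 / 60 * 134217728 / 318630569472 AS minutes_between_rotations;
-- ≈ 3.83; the usual target is ~60 minutes, hence the suggestion to grow
-- innodb_log_file_size well beyond the current 128M.
```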
( Innodb_dblwr_writes ) = 6,573,796 / 545797 = 12 /sec — “Doublewrite buffer” writes to disk. “Doublewrites” are a reliability feature. Some newer versions / configurations don’t need them.
— (Symptom of other issues)
( innodb_flush_neighbors ) = 1 — A minor optimization when writing blocks to disk.
— Use 0 for SSD drives; 1 for HDD.
( ( Innodb_pages_read + Innodb_pages_written ) / Uptime / innodb_io_capacity ) = ( 4568524 + 150107660 ) / 545797 / 200 = 141.7% — If > 100%, need more io_capacity.
— Increase innodb_io_capacity (now 200) if the drives can handle it.
( innodb_io_capacity ) = 200 — I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.
( sync_binlog ) = 0 — Use 1 for added security, at some cost of I/O. =1 may lead to lots of “query end” states; =0 may lead to “binlog at impossible position” and lost transactions in a crash, but is faster.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF — Whether to log all Deadlocks.
— If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( max_connections ) = 2,496 — Maximum number of connections (threads). Impacts various allocations.
— If max_connections (now 2496) is too high and various memory settings are high, you could run out of RAM.
( character_set_server ) = character_set_server = latin1
— Charset problems may be helped by setting character_set_server (now latin1) to utf8mb4. That is the future default. Caution: there can be index definition problems with utf8mb4 (but utf8 does not have these problems).
( local_infile ) = local_infile = ON
— local_infile (now ON) = ON is a potential security issue
( (Queries-Questions)/Queries ) = (303268906-1169727)/303268906 = 99.6% — Fraction of queries that are inside Stored Routines.
— (Not bad if high; but it impacts the validity of some other conclusions.)
( Created_tmp_disk_tables ) = 570,444 / 545797 = 1 /sec — Frequency of creating disk “temp” tables as part of complex SELECTs
— increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216).
Check the rules for temp tables on when MEMORY is used instead of MyISAM. Perhaps minor schema or query changes can avoid MyISAM.
Better indexes and reformulation of queries are more likely to help.
( Created_tmp_disk_tables / Questions ) = 570,444 / 1169727 = 48.8% — Pct of queries that needed on-disk tmp table.
— Better indexes / No blobs / etc.
( Created_tmp_disk_tables / Created_tmp_tables ) = 570,444 / 729780 = 78.2% — Percent of temp tables that spilled to disk
— Maybe increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216); improve indexes; avoid blobs, etc.
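As a sketch of that tmp-table tweak (on RDS the persistent values live in the Parameter Group; the sizes here are illustrative, not a tuned recommendation):

```sql
-- 78% of temp tables spilling to disk suggests the in-memory ceiling is
-- too low. Raise both variables together: the lower of the two wins.
SET GLOBAL tmp_table_size      = 64 * 1024 * 1024;  -- from 16M
SET GLOBAL max_heap_table_size = 64 * 1024 * 1024;  -- from 16M
```

Note that tmp tables containing BLOB/TEXT columns go to disk regardless of these settings, which is why better indexes and avoiding blobs are listed as the more likely fixes.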
( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (150283122 + 50516 + 10949 + 0) / 150328440 = 1 — Statements per Commit (assuming all InnoDB)
— Low: Might help to group queries together in transactions; High: long transactions strain various things.
( Select_scan ) = 778,039 / 545797 = 1.4 /sec — full table scans
— Add indexes / optimize queries (unless they are tiny tables)
( Select_scan / Com_select ) = 778,039 / 1129934 = 68.9% — % of selects doing full table scan. (May be fooled by Stored Routines.)
— Add indexes / optimize queries
( Com_insert + Com_delete + Com_delete_multi + Com_replace + Com_update + Com_update_multi ) = (150283122 + 10949 + 0 + 0 + 50516 + 0) / 545797 = 275 /sec — writes/sec
— 275 writes/sec plus log flushes will probably max out the I/O write capacity of ordinary (spinning) drives.
( binlog_error_action ) = binlog_error_action = IGNORE_ERROR — What to do if the binlog cannot be written.
— IGNORE_ERROR is the default for backwards compatibility, but ABORT_SERVER is recommended.
( expire_logs_days ) = 0 — How soon to automatically purge binlog (after this many days)
— Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash.
(Not relevant if log_bin (now ON) = OFF)
( innodb_autoinc_lock_mode ) = 1 — Galera: desires 2 — 2 = “interleaved”; 1 = “consecutive” is typical; 0 = “traditional”.
— Galera desires 2; 2 requires BINLOG_FORMAT=ROW or MIXED
( slow_query_log ) = slow_query_log = OFF — Whether to log slow queries. (5.1.12)
( long_query_time ) = 10 — Cutoff (Seconds) for defining a “slow” query.
— Suggest 2
( log_slow_slave_statements ) = log_slow_slave_statements = OFF — (5.6.11, 5.7.1) By default, replicated statements won’t show up in the slowlog; this causes them to show.
— It can be helpful in the slowlog to see writes that could be interfering with Slave reads.
( Aborted_connects / Connections ) = 2,404 / 4995 = 48.1% — Perhaps a hacker is trying to break in? (Attempts to connect)
( thread_cache_size / Max_used_connections ) = 32 / 17 = 188.2%
— There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0. There is (according to a rumor) a ‘bug’ in the QC code that leaves some of it in the execution path unless you turn off both of those settings.
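A sketch of fully disabling it (both variables are dynamic in 5.6, though on RDS the Parameter Group is the persistent place to set them):

```sql
-- Turn off the switch AND free the buffer; leaving either one non-zero
-- reportedly keeps some query-cache code active on every query.
SET GLOBAL query_cache_type = OFF;
SET GLOBAL query_cache_size = 0;
```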
(Com_select + Qcache_hits) / (Com_insert + Com_update + Com_delete + Com_replace) = 0.00752
- Binlog_cache_use = 275 /sec
- Com_begin = 275 /sec
- Com_commit = 275 /sec
- Com_commit + Com_rollback = 275 /sec
- Com_flush = 12 /HR
- Com_insert = 275 /sec
- Com_kill = 3 /HR
- Com_purge = 12 /HR
- Com_show_grants = 0.66 /HR
- Com_show_slave_status = 60 /HR
- Com_slave_start = 0.0066 /HR
- Connection_errors_peer_address = 0.0066 /HR
- Handler_commit = 1377 /sec
- Handler_commit/Questions = 642
- Handler_delete = 330 /sec
- Handler_prepare = 1101 /sec
- Innodb_buffer_pool_pages_misc = 135,294
- Innodb_buffer_pool_pages_misc * 16384 / innodb_buffer_pool_size = 9.8%
- Innodb_buffer_pool_write_requests = 9230 /sec
- Innodb_data_fsyncs = 300 /sec
- Innodb_data_writes = 563 /sec
- Innodb_data_writes - Innodb_log_writes - Innodb_dblwr_writes = 275 /sec
- Innodb_data_written = 9595901 /sec
- Innodb_dblwr_pages_written = 275 /sec
- Innodb_os_log_fsyncs = 276 /sec
- Innodb_os_log_pending_fsyncs = 1
- Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group = 1,002.1MB
- Innodb_pages_written = 275 /sec
- Innodb_rows_deleted = 330 /sec
- Innodb_rows_deleted + Innodb_rows_inserted = 669 /sec
- Prepared_stmt_count = 4
- Ssl_accepts = 1,019
- Ssl_finished_accepts = 1,019
- Ssl_session_cache_overflows = 340
- Ssl_used_session_cache_entries = 128
- performance_schema_max_cond_instances = 12,884
- performance_schema_max_socket_instances = 5,012
- performance_schema_max_thread_instances = 5,092
- Slave_running = ON
- innodb_fast_shutdown = 1
- log_output = TABLE
- optimizer_trace = enabled=off,one_line=off
- optimizer_trace_features = greedy_search=on, range_optimizer=on, dynamic_range=on, repeated_subselect=on
- read_only = ON
- relay_log_recovery = ON
- slave_rows_search_algorithms = TABLE_SCAN,INDEX_SCAN
- time_zone = UTC
Suggestions to consider for your AWS RDS Parameters Group
- innodb_lru_scan_depth=100 # from 1024, to reduce 90% of CPU cycles used for this function
- innodb_io_capacity=1900 # from 200, to use more of SSD capacity
- innodb_log_buffer_size=1G # from 8M, to support ~30 minutes of log activity in RAM
- innodb_log_file_size=4G # from 128M, to minimize log rotation
These changes will significantly reduce CPU busy time, and they are just the tip of the iceberg.
The last two will require the assistance of AWS RDS support team.
For additional assistance, view my profile and Network profile, where you will find free utility scripts to assist with performance tuning.
What is the reason for FLUSH running 12 times per hour, on average?