I have multiple slaves reading from one master.. But few of them are keep disconnecting due to network issue.
Is there any way that i can make sure before setting up the bin log purging script on master that all the slave have read upto certain bin logs which we can purge?
The easiest method is the following
- Go to each Slave and run
SHOW SLAVE STATUSG
- Look for
Relay_Master_Log_Fileon Each Slave
- Whichever Slave has the oldest
Relay_Master_Log_Fileis the one you purge to on the Master
Relay_Master_Log_File ? First Look at
SHOW SLAVE STATUSG
mysql> show slave statusG *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.48.20.253 Master_User: replicant Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000255 Read_Master_Log_Pos: 858190247 Relay_Log_File: relay-bin.066069 Relay_Log_Pos: 873918 Relay_Master_Log_File: mysql-bin.000254 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 858190247 Relay_Log_Space: 873772 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 1 row in set (0.00 sec)
In this display they are different. So why choose
Master_Log_Filerepresents the binlog with the last binlog event downloaded to the Slave
Relay_Master_Log_Filerepresents the binlog with the last binlog event executed on the Slave
In the event of any replication lag,
Relay_Master_Log_File is always the oldest. If they are the same, fine. You choose
In this case, you purge to
mysql-bin.000254 on the Master. In the event the SQL thread dies, you don’t want to erase binlogs from the Master the Slave has not processed yet.
Getting back to multiple slaves, you choose the oldest
Relay_Master_Log_File of all Slaves.
To clarify, the oldest
Relay_Master_Log_File is the binary log name with the lowest number.
You should rather monitor your slaves for replication lag or if it’s actually catching-up / connected with master or not!
You can use Percona Monitoring Tools‘s pmp-check-mysql-replication-delay / pmp-check-mysql-replication-running. You may google for setup instructions to setup with Nagios or setup with Zabbix or this.
Btw, 7 days worth binary logs (with backups) are sufficient but again depends on policies!
If you are to avoid disconnects due to transient network issues, depending on which MySQL you’re on you may want to set combination of master-retry-count / slave-net-timeout / master-connect-retry. (but it’s better to fix network right?)
The first thing I would do is fix the network issue.
I would also set up pt-heartbeat to monitor Mysql slave delay.
Why do you need to purge the binlog on Master? If you miss the position of a binlog to purge you could affect all your slaves.
Mysql allows to set up binlog retentions.
If you are running out of space, because your binlog are too large, I would check if you can use SQL Statement binlog format or mixed. RAW does take more space.
If you have to use RAW, i would get more space. Master is the driver of all your transactions, you want to make sure it is healthy and have room to grow.
No way I think. But
1) You could just stop purging if any of the slaves are unreachable(ssh or mysql) from the server where you are running this purging script
2) If the connectivity issue is only with MySQL and not SSH, you could get the positions from master.info files of the slaves
Actually I follow this method
If you cannot afford more space, another option is to use : START SLAVE …. UNTIL
You would execute this command on ALL Slaves which will have them stop at exact same position. You would then check all slave to make sure they all stopped at same position. Then on Master you can run :
PURGE BINARY LOGS TO ‘mysql-bin.xxx’;
PURGE BINARY LOGS BEFORE ‘2015-08-27 00:00:00’; (easier to use, don’t need to find out position)