Anyway to identify mysql slaves have read upto which binary logs

Posted on

Question :

I have multiple slaves reading from one master.. But few of them are keep disconnecting due to network issue.

Is there any way that i can make sure before setting up the bin log purging script on master that all the slave have read upto certain bin logs which we can purge?

Answer :

The easiest method is the following

  • Go to each Slave and run SHOW SLAVE STATUSG
  • Look for Relay_Master_Log_File on Each Slave
  • Whichever Slave has the oldest Relay_Master_Log_File is the one you purge to on the Master

Why Relay_Master_Log_File ? First Look at SHOW SLAVE STATUSG

mysql> show slave statusG
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 10.48.20.253
                Master_User: replicant
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000255
        Read_Master_Log_Pos: 858190247
             Relay_Log_File: relay-bin.066069
              Relay_Log_Pos: 873918
      Relay_Master_Log_File: mysql-bin.000254
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 858190247
            Relay_Log_Space: 873772
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 0
1 row in set (0.00 sec)

You see Master_Log_File and Relay_Master_Log_File

In this display they are different. So why choose Relay_Master_Log_File ?

  • Master_Log_File represents the binlog with the last binlog event downloaded to the Slave
  • Relay_Master_Log_File represents the binlog with the last binlog event executed on the Slave

In the event of any replication lag, Relay_Master_Log_File is always the oldest. If they are the same, fine. You choose Relay_Master_Log_File always.

In this case, you purge to mysql-bin.000254 on the Master. In the event the SQL thread dies, you don’t want to erase binlogs from the Master the Slave has not processed yet.

Getting back to multiple slaves, you choose the oldest Relay_Master_Log_File of all Slaves.

To clarify, the oldest Relay_Master_Log_File is the binary log name with the lowest number.

You should rather monitor your slaves for replication lag or if it’s actually catching-up / connected with master or not!

You may simply write a shell script to look at io_thread and sql_thread values and alert if they’re NO. There are already many scripts available for reference.

You can use Percona Monitoring Tools‘s pmp-check-mysql-replication-delay / pmp-check-mysql-replication-running. You may google for setup instructions to setup with Nagios or setup with Zabbix or this.

Btw, 7 days worth binary logs (with backups) are sufficient but again depends on policies!

If you are to avoid disconnects due to transient network issues, depending on which MySQL you’re on you may want to set combination of master-retry-count / slave-net-timeout / master-connect-retry. (but it’s better to fix network right?)

The first thing I would do is fix the network issue.
I would also set up pt-heartbeat to monitor Mysql slave delay.
Why do you need to purge the binlog on Master? If you miss the position of a binlog to purge you could affect all your slaves.

Mysql allows to set up binlog retentions.
If you are running out of space, because your binlog are too large, I would check if you can use SQL Statement binlog format or mixed. RAW does take more space.
If you have to use RAW, i would get more space. Master is the driver of all your transactions, you want to make sure it is healthy and have room to grow.

No way I think. But

1) You could just stop purging if any of the slaves are unreachable(ssh or mysql) from the server where you are running this purging script

2) If the connectivity issue is only with MySQL and not SSH, you could get the positions from master.info files of the slaves

Actually I follow this method

If you cannot afford more space, another option is to use : START SLAVE …. UNTIL
You would execute this command on ALL Slaves which will have them stop at exact same position. You would then check all slave to make sure they all stopped at same position. Then on Master you can run :

PURGE BINARY LOGS TO ‘mysql-bin.xxx’;
PURGE BINARY LOGS BEFORE ‘2015-08-27 00:00:00’; (easier to use, don’t need to find out position)

Leave a Reply

Your email address will not be published. Required fields are marked *