MySQL on RDS, data transfer from one instance to another as a production job

Question :

I have a MySQL RDS instance (“production”) from which I need to copy incremental data to another MySQL RDS instance (“reporting”). By “incremental data” I mean the new rows in certain tables. Currently I accomplish this with a PHP script that queries “production” based on the last update time, writes CSV files to disk, and then imports them into “reporting” with mysqlimport.exe.

My current method takes ~80 seconds. Is this the most robust and efficient way to do this? I’ve found other tools for “one off” transfers, but nothing in terms of an “enterprise incremental database replication” tool.

Our current process is not a “continuous” replication because of batch operations on the reporting side. This may be negotiable though. Would easing that constraint open up tools that would make this easy?

Relevant related articles/questions:
Amazon’s article on importing data seems to mainly deal with the batch (initial load) case.

This can be done with a stored proc when moving from one DB to another on the same instance (the problem with my case is they are different instances). See: Keeping MySQL staging database on same server updated from production
I can’t put both DBs on the same instance because our instances are under quite a bit of load.

Answer :

MySQL RDS gives you two options.


You set up a Read Slave and run the reports from it. The RDS CLI includes a tool called rds-create-db-instance-read-replica that creates a Read Slave from a running RDS instance. Under the hood, the following happens:

  • a snapshot is made
  • the snapshot is brought up as a new instance
  • CHANGE MASTER TO is executed with the proper binary log file and position
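
A minimal sketch of that first option, using the RDS CLI tool named above. The instance identifiers are hypothetical, and the option name is an assumption from memory of the old RDS CLI, so verify with the tool's --help before running; the wrapper below only echoes the command as a dry run.

```shell
# Dry-run wrapper: print the command instead of executing it.
run() { echo "+ $*"; }

# Hypothetical identifiers; --source-db-instance-identifier is assumed.
run rds-create-db-instance-read-replica reporting-replica \
    --source-db-instance-identifier production-db
```

Once the real command is verified, drop the `run` prefix (or redefine `run() { "$@"; }`) to execute it.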


Take a live snapshot of Production. Then, open a new instance using that snapshot.

The tools for this are:

  • rds-create-db-snapshot
  • rds-restore-db-instance-from-db-snapshot
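
A sketch of the snapshot-and-restore flow with those two tools. Instance and snapshot names are hypothetical and the long option names are assumptions, so check each tool's --help first; the wrapper echoes the commands as a dry run.

```shell
# Dry-run wrapper: print the command instead of executing it.
run() { echo "+ $*"; }

# Hypothetical names; date-stamp the snapshot so nightly runs don't collide.
SNAP="production-$(date +%Y%m%d)"

run rds-create-db-snapshot --db-instance-identifier production-db \
    --db-snapshot-identifier "$SNAP"
run rds-restore-db-instance-from-db-snapshot --db-instance-identifier reporting-tmp \
    --db-snapshot-identifier "$SNAP"
```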


The only real difference between these options:

  • the Read Slave replicates continuously
  • the snapshot is brought up as a point-in-time instance

In both cases, once you are done, simply drop the new instance (to save money) at your discretion.

If you combine the two ideas, you could load your PHP script’s output into the point-in-time instance. Any data you still need from production could be read from the Read Slave instead, preventing further load issues on production.


Bringing up instances can be time-consuming, so plan to have the instance up early enough for your reporting and import window.

This is more ranting and brainstorming than a concrete answer.

Write your own bash script and schedule an incremental load job to sync the production and reporting/DWH databases:

Step 1: Run a mysqldump on production (without locks) to fetch records from the relevant tables (WHERE updated_at BETWEEN Previous_Day_Start AND Previous_Day_End).
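
The Previous_Day_Start / Previous_Day_End window for step 1 can be computed inside the script itself. A minimal sketch, assuming GNU date (as found on typical Linux hosts):

```shell
# Compute yesterday's full-day window for the incremental WHERE clause
# (GNU date's -d option assumed; BSD/macOS date needs -v-1d instead).
vDay=$(date -d 'yesterday' +%Y-%m-%d)
vStartDate="$vDay 00:00:00"
vEndDate="$vDay 23:59:59"

echo "window: $vStartDate .. $vEndDate"
```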

Step 2: Restore the dumped SQL file into the reporting/DWH instance (the delta DB; create your staging DB here).

Step 3: Call a stored procedure (SP) to sync records from the reporting/DWH delta DB into the actual reporting/DWH DB (both DBs are on the same instance).

SP logic: first update existing rows by matching on the primary key, then insert the new records.
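
In MySQL, that update-then-insert pair can often be collapsed into a single statement inside the procedure. A sketch of the SQL, where the table and column names (delta.orders, dwh.orders, id, amount, updated_at) are hypothetical stand-ins for your own schema:

```shell
# Hypothetical schema; INSERT ... ON DUPLICATE KEY UPDATE does the
# "update matched rows, insert new rows" merge in one statement.
sql="INSERT INTO dwh.orders (id, amount, updated_at)
SELECT id, amount, updated_at FROM delta.orders
ON DUPLICATE KEY UPDATE
    amount     = VALUES(amount),
    updated_at = VALUES(updated_at);"

printf '%s\n' "$sql"
```

This relies on the primary key (or a unique index) being present on the target table, which the SP logic above already assumes.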

Script Commands:
#(1) Do the MySQL database backup (dump)
mysqldump -h$vHost1 -u$vUser -p$vPass1 --skip-comments --quick --extended-insert --set-gtid-purged=OFF --skip-triggers --skip-add-locks $vDb1 $vTable --where="updated_at between '$vStartDate' and '$vEndDate'" > $vFile

#(2) Do the MySQL restore
mysql -h$vHost2 -u$vUser -p$vPass2 --comments --database=$vDb2 < $vFile

#(3) SP call
mysql -h$vHost2 -u$vUser -p$vPass2 --skip-column-names --database=$vDb2 --execute="CALL your_sp();" >> "$vDailyIncrementalLoad/$vFileName"

Warning: mysqldump is a costly operation, so run the job nightly during non-business hours. Thanks!
