I have a staging DB server on which functionality is tested before pushing it to the production environment. Now, we want to start doing VA/PT (Vulnerability Analysis and Penetration Testing) on our application, but since that can be destructive on the staging DB server, we want to make a separate Testing environment with a VAPT web+DB server.
The data from the staging DB server must be replicated onto the VAPT DB server automatically so that specific new use cases, features, etc can be tested for vulnerabilities.
Due to VAPT activities (test data, SQL injection, possibly DROP TABLE exploits, etc.), the VAPT DB server will accumulate its own data changes, i.e. it will diverge from the Staging DB (Master).
So, if I use simple Master-Slave replication as below I am assured of #1:
Staging DB (Master) -> VAPT DB (Slave)
But once #2 happens and the slave diverges, which is fine for the testing environment, will that divergence interrupt or break the Master-Slave replication from #1?
An obvious example where divergence will cause errors: a VA/PT activity runs `DROP TABLE users` on the VAPT server, after which replicated INSERTs/UPDATEs from the Staging DB (Master) against the `users` table will fail with replication errors. Some UPDATEs and DELETEs against diverged rows might cause errors too.
If I use ROW-based replication, divergence will surface quickly, causing frequent errors.
If I use STATEMENT-based replication, ids will stop matching between the two servers, so some data may break: ids are what link rows in related tables, even though we do not use foreign keys.
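For reference, the format in question is chosen with the `binlog_format` server variable on the master; a my.cnf sketch (the Master-Slave and ROW/STATEMENT terminology suggests MySQL, and the value here is illustrative):

```ini
[mysqld]
# ROW replicates the changed rows themselves; STATEMENT replays the SQL text.
# A MIXED mode also exists. With a diverged slave, either format eventually
# hits an event it cannot apply cleanly.
binlog_format = ROW
```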
Alternatively, instead of replication, I could dump the Staging DB into the VAPT DB daily, but that seems cumbersome to automate.
I could make copy DBs and set up various partial copy operations, but that would complicate matters too much, given that I am not a developer and that my developers frequently make and revert changes of various sizes.
EDIT: The data directory varies between 20 and 25 GB on Staging.
Surely someone has come across this problem before, and there might be a set of best practices for this situation, i.e. maintaining a match between the staging and testing environments in near real-time while allowing testers the freedom to play with the data.
I tried googling for a while, but the right phrasing escapes me. All I get are howtos for master-slave replication and for handling unwanted drift/divergence; nothing much about desired/accepted drift and divergence, or about partial replication.
Thanks in advance.
Given the situation that you’ve described, any form of replication will fail spectacularly within minutes of a rigorous pen test. Having a completely isolated database instance for this purpose alone is the most logical solution.
Here are a couple of reasons why:
- As you said, someone issues a `DROP TABLE` and, because there's no RI (referential integrity) in your database to prevent it and because it sounds like the application uses an account with full privileges, the command is carried out. Replication won't "fix this" for you.
- Someone creates a new record on the replicated instance. Are all primary keys GUIDs? If so, you might be lucky enough to never hit a primary key collision, but if you're using incremental integers there will be overlap (even if the replica is configured to use a higher range of numbers).
- Someone deletes a record
- Someone adds a column, then that same column is later added to the main server
- Someone truncates a table
- And on and on and on …
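To make the integer-key point above concrete: even pushing the replica's counters into a high range (a MySQL sketch, reusing the `users` table from the question and an illustrative number) only postpones the collision, because the master's own counter keeps climbing toward that range:

```sql
-- Start the VAPT copy's ids at a high value; the Staging (master) counter
-- will eventually reach this range too, and replicated INSERTs will then
-- collide with rows created locally during testing.
ALTER TABLE users AUTO_INCREMENT = 1000000000;
```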
Automated restores of your production database should already be something that the company is doing as part of its backup verification process, so I don’t see how this would be cumbersome or difficult to automate. Every operating system under the sun can handle a little script that takes a recent backup and restores it to a database. Unless your system is already many hundreds of gigabytes in size, this would be a very simple and cost effective way forward.
As an added bonus, this script would make refreshing the development database(s) an absolute breeze.
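Such a refresh job can be as small as the sketch below. The hostnames, schema name, and paths are assumptions; the commands are built into variables and only echoed here as a dry run, so drop the echo loop (or pipe its output through `sh`) to run it for real.

```shell
#!/bin/sh
# Nightly "refresh the VAPT database from Staging" job -- a sketch.
set -eu

STAGING_HOST="staging-db.example.com"   # assumed hostname
VAPT_HOST="vapt-db.example.com"         # assumed hostname
DB_NAME="appdb"                         # assumed schema name
DUMP_FILE="/var/backups/${DB_NAME}-$(date +%F).sql.gz"

# 1. Take a consistent dump of Staging; --single-transaction avoids
#    locking InnoDB tables for the duration of the dump.
DUMP_CMD="mysqldump -h $STAGING_HOST --single-transaction --routines --triggers $DB_NAME | gzip > $DUMP_FILE"

# 2. Drop and recreate the schema on the VAPT server so leftovers from
#    the previous pen-test run (dropped tables, junk rows) are wiped out.
RESET_CMD="mysql -h $VAPT_HOST -e \"DROP DATABASE IF EXISTS $DB_NAME; CREATE DATABASE $DB_NAME\""

# 3. Load the dump into the fresh schema.
LOAD_CMD="gunzip -c $DUMP_FILE | mysql -h $VAPT_HOST $DB_NAME"

for cmd in "$DUMP_CMD" "$RESET_CMD" "$LOAD_CMD"; do
    echo "WOULD RUN: $cmd"
done
```

Run it from cron after the nightly backup completes; the same three variables are all that need changing to point it at a development database instead.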