Fastest way to move a Collection from one Sharded MongoDB Cluster to another Sharded MongoDB Cluster


Question :

I have to move a MongoDB collection (size: 1 TB+) from one sharded MongoDB cluster to another (5 shards on both clusters). The application that uses the collection will be offline during the operation, so inconsistent data is not a concern, but I need to minimize the downtime.

I’ve tried testing it with mongodump and mongorestore, but it looks like that takes ages to finish.

Please share your experience with this kind of scenario.

Note: The cluster has only one collection, so I am open to a cluster-level sync as well.

Answer :

As per the MongoDB documentation, Create Chunks in a Sharded Cluster:

If you want to ingest a large volume of data into a cluster that is
unbalanced, or where the ingestion of data will lead to data
imbalance (such as with monotonically increasing or decreasing shard
keys), pre-splitting the chunks of an empty sharded collection can
help with the throughput.
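
To illustrate why monotonically increasing shard keys lead to imbalance, here is a minimal sketch in plain JavaScript. It does not talk to MongoDB; it only models range-based chunk assignment. The chunk boundaries, the key values and the function names (chunkIndexFor, countPerChunk) are made up for this illustration.

```javascript
// Return the index of the chunk whose range contains `key`.
// `bounds` is a sorted list of split points; chunk i holds keys in
// [bounds[i - 1], bounds[i]), with open ends for the first and last chunk.
function chunkIndexFor(key, bounds) {
  let i = 0;
  while (i < bounds.length && key >= bounds[i]) {
    i++;
  }
  return i;
}

// Count how many of the given keys land in each chunk.
function countPerChunk(keys, bounds) {
  const counts = new Array(bounds.length + 1).fill(0);
  for (const key of keys) {
    counts[chunkIndexFor(key, bounds)]++;
  }
  return counts;
}

// Hypothetical layout: four chunks split at 1000, 2000 and 3000.
const bounds = [1000, 2000, 3000];

// Monotonically increasing keys (for example counters or timestamps)
// that start above every existing split point.
const monotonic = Array.from({ length: 1000 }, (_, i) => 5000 + i);

// Keys spread evenly over the whole range.
const spread = Array.from({ length: 1000 }, (_, i) => i * 4);

console.log(countPerChunk(monotonic, bounds)); // [ 0, 0, 0, 1000 ]
console.log(countPerChunk(spread, bounds));    // [ 250, 250, 250, 250 ]
```

In this model every monotonic key falls into the last chunk, so a single shard would receive all of the writes until chunks are split and migrated. Pre-splitting on boundaries that match the incoming data avoids that bottleneck during a bulk load.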

EXAMPLE

To create chunks for documents in the myapp.users collection using the email field as the shard key, use the following operation in the mongo shell:

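// Character codes 97..122 are the lowercase letters a..z.
for ( var x=97; x<97+26; x++ ){
    // Step the second letter by 6, giving the prefixes xa, xg, xm, xs, xy.
    for ( var y=97; y<97+26; y+=6 ) {
        var prefix = String.fromCharCode(x) + String.fromCharCode(y);
        // Split the chunk that contains this prefix at exactly this email value.
        db.adminCommand( { split: "myapp.users", middle: { email : prefix } } );
    }
}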

This assumes a collection size of 100 million documents.
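
As a rough sketch, the loop above can be separated into two steps: first build the list of split points, then send one split command per point. That makes it possible to inspect the list before touching the target cluster. The function names buildSplitPoints and preSplit are hypothetical helpers for this illustration, not part of the MongoDB shell; only db.adminCommand and the split command come from the documentation example.

```javascript
// Build two-letter split points: every first letter a..z combined with
// every `step`-th second letter. step = 6 reproduces the loop above.
function buildSplitPoints(step) {
  if (!Number.isInteger(step) || step < 1) {
    throw new Error("step must be a positive integer");
  }
  const points = [];
  for (let x = 97; x < 97 + 26; x++) {
    for (let y = 97; y < 97 + 26; y += step) {
      points.push(String.fromCharCode(x) + String.fromCharCode(y));
    }
  }
  return points;
}

// Send one `split` command per split point and return the points that failed.
// `adminCommand` is any function that forwards a command document to the
// cluster; in the shell that would be (cmd) => db.adminCommand(cmd).
function preSplit(adminCommand, namespace, field, points) {
  const failed = [];
  for (const point of points) {
    try {
      const result = adminCommand({ split: namespace, middle: { [field]: point } });
      if (!result || result.ok !== 1) {
        failed.push(point);
      }
    } catch (err) {
      // Some shells throw on a failed command instead of returning ok: 0.
      failed.push(point);
    }
  }
  return failed;
}

// Usage in the shell, connected to a mongos of the target cluster
// (assumes myapp.users is already sharded on email and still empty):
//   const points = buildSplitPoints(6);   // 130 split points, "aa" .. "zy"
//   const failed = preSplit((cmd) => db.adminCommand(cmd), "myapp.users", "email", points);
```

Passing the command function in as a parameter keeps the helper testable without a live cluster. The split points should of course follow the real distribution of the shard key in the source collection; the two-letter email prefixes are only the documentation's example.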
