I am in the process of planning my first production MongoDB deployment, and I have a question regarding the disk size of each shard in the cluster.
The cluster will store time-series building-sensor data, with the potential to accumulate 2TB of data per year quite soon after deployment.
So, I plan to start with a 2-shard cluster (2 x 3-node replica sets), since that puts the query routers and config servers in place from the get-go. However, I could do with some advice on how to decide how much data to store per shard.
Each ‘shard node’ will have a minimum of 48GB RAM and obviously the right disk configuration to meet the IOPS requirement.
If I can satisfy the IOPS requirement of the applications using this DB, what is to stop me sizing for 2TB per shard? Is there a limit on the volume of data a single shard should hold, or guidelines to help my decision-making process?
I am reading a lot about performance issues when the data volume exceeds the available RAM per host. But if the disks provide enough IOPS, surely this isn’t a problem? I appreciate that disks will still be far slower than memory, but if MongoDB performs poorly once you exceed the RAM size, how do people deal with large databases? The cost of continually adding small shards to a cluster just to stay within RAM is huge!
In short: to keep cluster expansion to a minimum, if I can satisfy my IOPS requirement, can I safely store any amount of data on a single shard? Or is there a much lower recommendation, and if so, for what reason?
Also, I know I must try to keep my index size below my RAM size to ensure efficient query execution. Here is an example:
If my data volume is 1TB per shard and I have 48GB of RAM on each of the shard nodes, is there a way to estimate index size? The working set is hard to estimate, since this is a data-logging system which may update anything up to the total point count of data entering the database every minute. That might be 30,000 documents, inserted once, then updated with minute data for the whole day, then a new 30,000 documents the following day, then updated, and so on.
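For the index-size part of the question, a back-of-the-envelope estimate is possible from the insert rate alone. The figures below (a compound shard-key index of ~20 bytes per entry plus ~25 bytes of B-tree overhead, a 3-year horizon) are assumptions for illustration only; on real data, `db.collection.stats()` reports the actual per-index sizes.

```python
# Rough index-size estimate for the scenario above: ~30,000 new documents
# per day. All per-entry byte counts are assumptions, not MongoDB guarantees;
# measure real indexes with db.collection.stats().

DOCS_PER_DAY = 30_000
YEARS = 3
docs_total = DOCS_PER_DAY * 365 * YEARS

# Assume a compound {sensorId, timestamp} key of ~20 bytes
# plus ~25 bytes of per-entry B-tree overhead.
BYTES_PER_ENTRY = 20 + 25
index_bytes = docs_total * BYTES_PER_ENTRY

print(f"{docs_total:,} docs -> ~{index_bytes / 1024**3:.1f} GiB per index")
```

On these assumptions a single index stays small relative to 48GB of RAM; it is usually the number of indexes and the working set, not one index, that eats the memory.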
You’ll need to increase the chunk size (default: 64MB); otherwise you’ll be limited there.
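To make the limit concrete: if I recall the MongoDB 3.x limits page correctly, the maximum data size you can shard in an *existing* collection works out as maxSplits = 16MB / average shard-key value size, and maxCollectionSize (MB) = maxSplits × (chunkSize / 2). The 128-byte average key size below is an assumption for illustration.

```python
# Sketch of the "sharding existing collection data size" limit
# (formula as documented for MongoDB 3.x; key size is an assumption).

AVG_SHARD_KEY_BYTES = 128          # assumed average shard-key value size
CHUNK_SIZE_MB = 64                 # default chunk size

max_splits = 16 * 1024 * 1024 // AVG_SHARD_KEY_BYTES
max_collection_mb = max_splits * (CHUNK_SIZE_MB / 2)
print(f"max initial shardable size: ~{max_collection_mb / 1024**2:.1f} TB")
```

With these numbers that is about 4TB, which a cluster growing at 2TB/year would reach quickly if a large collection ever had to be sharded after the fact; doubling the chunk size doubles the limit.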
General limit information from MongoDB:
The MMAPv1 storage engine limits each database to no more than 16000 data files. This means that a single MMAPv1 database has a maximum size of 32TB. Setting the storage.mmapv1.smallFiles option reduces this limit to 8TB.
Changed in version 3.0.
Using the MMAPv1 storage engine, a single mongod instance cannot manage a data set that exceeds the maximum virtual memory address space provided by the underlying operating system.
Virtual Memory Limitations:
Linux: 64 terabytes (journaled) – 128 terabytes (not journaled)
Windows Server 2012 R2/Windows 8.1: 64 terabytes (journaled) – 128 terabytes (not journaled)
Windows (otherwise): 4 terabytes (journaled) – 8 terabytes (not journaled)
(The WiredTiger storage engine is not subject to this limitation.)
It’s not an easy task to plan the dimensions of shards, and there is no foolproof or error-proof way of doing it.
What I tend to do is to create a certain fraction of the expected data per shard as dummy data. If I have a shard handy, I fill it completely.
In the next step, I generate the same fraction of the expected load on a matching number of front-end servers.
Now, I analyse the memory consumption of the indices and overall. Knowing the index size, you roughly know the minimum RAM you need. To have enough warning time and a decent amount of RAM for connections, the working set and other processes, I round up to the next sensible RAM size. This way you make sure your RAM does not fill up easily, you get a decent warning time when you have to scale out because of insufficient RAM, and your application won’t be throttled by your database.
Let me illustrate that. Given my average object size is slightly under 1kB and I plan to have a 1TB partition, but I only have a 256GB partition for tests, I’d create somewhere in the order of 250 million dummy documents. Now, let’s say we plan to have 8 front-end servers for the application, I’d set up two of them.
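The dummy-document count in that illustration is simple arithmetic; a sketch of it, using the figures from the example (a ~1kB average document and a 256GB test partition, both assumptions of the example, not measurements):

```python
# How many ~1 kB dummy documents fill a 256 GB test partition?
# Real average document size should come from db.collection.stats().

AVG_DOC_BYTES = 1000                # "slightly under 1 kB"
PARTITION_BYTES = 256 * 1024**3

n_docs = PARTITION_BYTES // AVG_DOC_BYTES
print(f"~{n_docs / 1e6:.0f} million dummy documents")
```

That lands in the 250-275 million range, hence "somewhere in the order of 250 million" above.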
Next, I’d use a load testing tool (there are many, and they are out of scope of this answer) to generate load on those two front-end servers.
After you have done that, you should have a quite accurate indication of overall memory consumption, index size and working set size.
Let’s say we find out that our indices are 20GB in size and 300 connections are made, so let’s round that up to 21GB. Our shard should then have four times this RAM, rounded up to the next sensible value (keep cost efficiency in mind!). Let’s assume that would be 96GB.
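The rule of thumb above reduces to a few lines of arithmetic. The 1GB connection allowance (roughly 300 connections at about 1MB of thread stack each, rounded up) and the list of "sensible" RAM sizes are my assumptions for the sketch:

```python
# RAM sizing rule of thumb: (index size + connection overhead) * 4,
# rounded up to the next common RAM configuration. All inputs mirror
# the worked example and are assumptions, not measurements.

INDEX_GB = 20
CONN_OVERHEAD_GB = 1        # ~300 connections at ~1 MB of stack each, rounded up
COMMON_RAM_SIZES_GB = [32, 48, 64, 96, 128, 192, 256]

needed = (INDEX_GB + CONN_OVERHEAD_GB) * 4
ram = next(s for s in COMMON_RAM_SIZES_GB if s >= needed)
print(f"need {needed} GB -> provision {ram} GB")
```

The factor of four is deliberately generous: it buys headroom for the working set, the OS cache, and growth, which is where the "decent warning time" comes from.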
Now, you should find out whether it makes more sense to have two shards instead of one, each with 48GB of RAM and 500GB of disk space for MongoDB. This may be the case because bigger SSDs can be disproportionately expensive.
I think you get the picture: there is no easy formula, you simply have to do your homework 😉 My shortish approach might or might not work for you. If it doesn’t, you have to work one out yourself.
Oh, and whatever you do when planning a shard: plan with SSDs. It’s worth it.