What are the pros and cons of having more data-bearing members in a MongoDB replica set? [closed]

Question:

A 3-member replica set has the same fault tolerance whether all 3 members contain a copy of the data or one member is an arbiter, so why would I ever want all 3 members to have a copy of the data?

What is a situation where all members having a copy of the data makes sense?

Answer:

There are a few key aspects to fault tolerance in a replica set:

  • write availability: a majority of voting replica set members is required to maintain or elect a primary
  • data redundancy: writes should be acknowledged by multiple members of a replica set (ideally a majority, to avoid rollbacks)

A three-member Primary-Secondary-Arbiter (PSA) replica set supports the first aspect: if any single member of the replica set is unavailable, a primary can still be maintained or elected.
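
For reference, a PSA deployment is simply a replica set configuration in which one member is flagged as arbiterOnly. The following is a minimal sketch of initiating such a set with PyMongo; the hostnames, port, and the set name "rs0" are hypothetical.

    # Minimal sketch: initiate a three-member PSA replica set (hypothetical hosts).
    from pymongo import MongoClient

    # Connect directly to one of the not-yet-initiated members.
    client = MongoClient("mongodb://mongo1.example.net:27017", directConnection=True)
    client.admin.command("replSetInitiate", {
        "_id": "rs0",
        "members": [
            {"_id": 0, "host": "mongo1.example.net:27017"},  # data-bearing, primary-eligible
            {"_id": 1, "host": "mongo2.example.net:27017"},  # data-bearing secondary
            {"_id": 2, "host": "mongo3.example.net:27017", "arbiterOnly": True},  # votes only, stores no data
        ],
    })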

However, an arbiter does not support the second aspect of fault tolerance since it does not store any data.

When a PSA configuration is degraded to P_A (its secondary unavailable), there are some significant operational consequences:

  • Until you have a second data-bearing member online, data redundancy is compromised and writes can only be acknowledged by the current primary.
  • You no longer have active replication. If your secondary is offline for too long, it may fall off the oplog and need to be re-synced.
  • Applications and internal processes can no longer use a majority write concern. If you haven’t planned for this, writes requesting majority acknowledgement may block until a majority of data-bearing members are available or a timeout is reached (if set for the write command); see the sketch after this list. If your replica set is part of a sharded cluster, this can also prevent successful chunk migrations between shards.
  • Applications and internal processes will not see the newest data on the primary using a majority read concern. Some features (for example, change streams in MongoDB 3.6+) rely on reading majority committed data to avoid the chance that changes may be rolled back. If the majority commit point cannot be advanced, there will also be increased pressure on the WiredTiger cache.
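
If you depend on majority acknowledgement, it is worth setting a write concern timeout so that writes against a degraded P_A set fail quickly instead of blocking. The following is a minimal PyMongo sketch, assuming a hypothetical replica set "rs0" with a database "app" and collection "events"; the 5-second timeout is an arbitrary choice.

    # Minimal sketch: majority writes/reads against a hypothetical "app.events" collection.
    from pymongo import MongoClient
    from pymongo.errors import WriteConcernError
    from pymongo.read_concern import ReadConcern
    from pymongo.write_concern import WriteConcern

    client = MongoClient("mongodb://mongo1.example.net,mongo2.example.net/?replicaSet=rs0")
    events = client.app.get_collection(
        "events",
        write_concern=WriteConcern(w="majority", wtimeout=5000),  # give up after 5 seconds
        read_concern=ReadConcern("majority"),  # only return majority-committed data
    )

    try:
        events.insert_one({"type": "signup"})
    except WriteConcernError:
        # In a degraded P_A set the write is applied on the primary but cannot be
        # majority-acknowledged; without wtimeout this call would block instead.
        print("write was not majority-acknowledged within the timeout")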

The above consequences can be avoided by having a Primary-Secondary-Secondary (PSS) configuration.
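
In terms of configuration, the only difference from the PSA sketch above is that the third member is a normal data-bearing secondary (hostnames again hypothetical):

    # Minimal sketch: the PSS variant of the earlier configuration document.
    pss_config = {
        "_id": "rs0",
        "members": [
            {"_id": 0, "host": "mongo1.example.net:27017"},
            {"_id": 1, "host": "mongo2.example.net:27017"},
            {"_id": 2, "host": "mongo3.example.net:27017"},  # data-bearing, so majority writes/reads survive any single-member outage
        ],
    }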

It is worth keeping in mind that fault tolerance doesn’t only apply to failure modes. There are normal maintenance tasks such as software or hardware upgrades that may require a member to be briefly unavailable.

What is a situation where all members having a copy of the data makes sense?

If you are running a three-member replica set in production with MongoDB 3.6 or newer (or any version of MongoDB where this replica set is backing a shard): always.

If you are running an older version of MongoDB in production: always, unless you are prepared to accept the above operational consequences.
