Arbiter brings down the three-node MongoDB replica set [duplicate]

Question :

I have a typical three-node MongoDB replica set: one primary, one secondary, and one arbiter. I thought the replica set would survive one node going offline, but it didn't. Please help me understand what went wrong.

Here is what I observed. Is it supposed to work like this?

  • When I take the arbiter offline (killing the instance), both data nodes become secondary (they were primary and secondary).

  • When I take the primary offline (killing the instance), the secondary and arbiter remain secondary and arbiter. There is no voting to select the new primary.

Thanks!

Here is the config

{
  "_id": "0",
  "version": 8,
  "protocolVersion": NumberLong(1),
  "writeConcernMajorityJournalDefault": true,
  "members": [
    {
      "_id": 3,
      "host": ":27017",
      "arbiterOnly": true,
      "buildIndexes": true,
      "hidden": false,
      "priority": 0,
      "tags": {

      },
      "slaveDelay": NumberLong(0),
      "votes": 1
    },
    {
      "_id": 4,
      "host": ":27017",
      "arbiterOnly": false,
      "buildIndexes": true,
      "hidden": false,
      "priority": 0,
      "tags": {

      },
      "slaveDelay": NumberLong(0),
      "votes": 0
    },
    {
      "_id": 5,
      "host": ":27017",
      "arbiterOnly": false,
      "buildIndexes": true,
      "hidden": false,
      "priority": 3,
      "tags": {

      },
      "slaveDelay": NumberLong(0),
      "votes": 1
    }
  ],
  "settings": {
    "chainingAllowed": true,
    "heartbeatIntervalMillis": 2000,
    "heartbeatTimeoutSecs": 10,
    "electionTimeoutMillis": 10000,
    "catchUpTimeoutMillis": 60000,
    "catchUpTakeoverDelayMillis": 30000,
    "getLastErrorModes": {

    },
    "getLastErrorDefaults": {
      "w": 1,
      "wtimeout": 0
    },
    "replicaSetId": ObjectId("")
  }
}

Answer :

As per the MongoDB documentation on Replica Set Elections:

Voting Members

The replica set member configuration setting members[n].votes and member state determine whether a member votes in an election.

All replica set members that have their members[n].votes setting equal to 1 vote in elections. To exclude a member from voting in an election, change the value of the member’s members[n].votes configuration to 0.

  • Changed in version 3.2: Non-voting members must have a priority of 0.

  • Members with a priority greater than 0 cannot have 0 votes.

Only voting members in the following states are eligible to vote:

  • PRIMARY
  • SECONDARY
  • STARTUP2
  • RECOVERING
  • ARBITER
  • ROLLBACK

A non-voting member has both votes and priority equal to 0:

{
   "_id" : <num>,
   "host" : <hostname:port>,
   "arbiterOnly" : false,
   "buildIndexes" : true,
   "hidden" : false,
   "priority" : 0,
   "tags" : {

   },
   "slaveDelay" : NumberLong(0),
   "votes" : 0
}

Note: Do not alter the number of votes to control which members will become primary. Instead, modify the members[n].priority option. Only alter the number of votes in exceptional cases, for example, to permit more than seven members.

When I take the arbiter offline (killing the instance), both data
nodes become secondary (they were primary and secondary).

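In this case, when the arbiter (_id = 3) goes down, the member with _id 4 has both priority and votes set to 0, so it can neither become primary nor vote in an election. The set therefore has only two voting members (_id 3 and _id 5), and a majority requires 2 votes. With the arbiter down, the primary can see only its own vote, the majority cannot be established, and the primary steps down. So both data nodes become secondary.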
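
The vote arithmetic can be checked with a small sketch. The helper names votingMajority and hasMajority are made up for illustration (they are not MongoDB APIs), and the member list copies only the fields relevant to vote counting from the config in the question.

```javascript
// Hypothetical helpers for illustration; not part of the MongoDB API.
// Only the fields relevant to vote counting are copied from the question's config.
const questionMembers = [
  { _id: 3, arbiterOnly: true,  priority: 0, votes: 1 },
  { _id: 4, arbiterOnly: false, priority: 0, votes: 0 },
  { _id: 5, arbiterOnly: false, priority: 3, votes: 1 },
];

// Number of votes needed for a strict majority of all configured votes.
function votingMajority(members) {
  const totalVotes = members.reduce((sum, m) => sum + m.votes, 0);
  return Math.floor(totalVotes / 2) + 1;
}

// True if the members that are still up hold a majority of the votes.
function hasMajority(members, downIds) {
  const upVotes = members
    .filter((m) => !downIds.includes(m._id))
    .reduce((sum, m) => sum + m.votes, 0);
  return upVotes >= votingMajority(members);
}

console.log(votingMajority(questionMembers));   // 2
console.log(hasMajority(questionMembers, [3])); // false: arbiter down, 1 of 2 votes
console.log(hasMajority(questionMembers, []));  // true: all members up
```

Because only two members carry a vote, losing either of them drops the set below the required 2 votes.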

When I take the primary offline (killing the instance), the secondary
and arbiter remain secondary and arbiter. There is no voting to select
the new primary.

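In this case, when the primary (_id = 5) goes down, the member with _id 4 has priority 0, so it cannot become primary. It also has 0 votes, which leaves the arbiter as the only reachable voting member, and one vote out of two is not a majority. Because of this, the secondary and arbiter remain secondary and arbiter.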
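
The missing candidate can be shown with a short sketch. electableCandidates is a hypothetical helper, not a MongoDB API; it assumes a member can stand for election only if it is up, data-bearing, and has a priority greater than 0.

```javascript
// Hypothetical helper for illustration; not part of the MongoDB API.
// Member fields are copied from the config in the question.
const setMembers = [
  { _id: 3, arbiterOnly: true,  priority: 0, votes: 1 },
  { _id: 4, arbiterOnly: false, priority: 0, votes: 0 },
  { _id: 5, arbiterOnly: false, priority: 3, votes: 1 },
];

// Returns the _id of every member that is up and could stand for election:
// it must hold data (not an arbiter) and have a priority above 0.
function electableCandidates(members, downIds) {
  return members
    .filter((m) => !downIds.includes(m._id))
    .filter((m) => !m.arbiterOnly && m.priority > 0)
    .map((m) => m._id);
}

// Primary (_id 5) down: the result is an empty array, nobody can stand.
console.log(electableCandidates(setMembers, [5]));

// Arbiter (_id 3) down: _id 5 is still a candidate, but it lacks a majority.
console.log(electableCandidates(setMembers, [3]));
```

With the primary down the list is empty, so no election can succeed regardless of how the votes fall.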

How to design this replica set to work better?

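Set both the priority and the votes of the node with _id 4 to 1; this fixes both scenarios. With three voting members, a majority of 2 votes is still reachable when any single node is down, and the node with _id 4 becomes eligible to be elected primary.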
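
A minimal sketch of that change follows. withVotingMember is a hypothetical helper written for this answer; rs.conf() and rs.reconfig() are the mongo shell helpers it would be used with, and they appear only in a comment so the sketch runs without a deployment. The config object is a trimmed copy of the one in the question.

```javascript
// Hypothetical helper: returns a copy of a replica set config in which the
// member with the given _id has priority 1 and votes 1. The input is not modified.
function withVotingMember(config, memberId) {
  return Object.assign({}, config, {
    members: config.members.map((m) =>
      m._id === memberId ? Object.assign({}, m, { priority: 1, votes: 1 }) : m
    ),
  });
}

// Trimmed copy of the question's config (hosts and settings omitted).
const before = {
  _id: "0",
  version: 8,
  members: [
    { _id: 3, arbiterOnly: true,  priority: 0, votes: 1 },
    { _id: 4, arbiterOnly: false, priority: 0, votes: 0 },
    { _id: 5, arbiterOnly: false, priority: 3, votes: 1 },
  ],
};

const after = withVotingMember(before, 4);
const totalVotes = after.members.reduce((sum, m) => sum + m.votes, 0);
console.log(totalVotes);                     // 3 voting members
console.log(Math.floor(totalVotes / 2) + 1); // majority is still 2

// In the mongo shell, connected to the primary, the change would be applied
// roughly like this (assumption: run while a primary is available):
//   rs.reconfig(withVotingMember(rs.conf(), 4))
```

The reconfiguration has to be issued while a primary exists, so it should be applied with all three nodes up.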
