MongoDB replica set – no election after primary node failed

Question:

For testing, I killed the primary node of my two-node replica set while the cluster was under heavy write load.

I was expecting the secondary node to take over after a few seconds, but this isn't happening.

shard0:SECONDARY> rs.status()
{
    "set" : "shard0",
    "date" : ISODate("2014-10-22T15:11:52Z"),
    "myState" : 2,
    "members" : [
        {
            "_id" : 1,
            "name" : "10.128.42.177:27020",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 10270,
            "optime" : Timestamp(1413990107, 52),
            "optimeDate" : ISODate("2014-10-22T15:01:47Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "10.128.42.188:27020",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1413990103, 49),
            "optimeDate" : ISODate("2014-10-22T15:01:43Z"),
            "lastHeartbeat" : ISODate("2014-10-22T15:11:50Z"),
            "lastHeartbeatRecv" : ISODate("2014-10-22T15:01:43Z"),
            "pingMs" : 0,
            "syncingTo" : "10.128.42.177:27020"
        }
    ],
    "ok" : 1
}

As you can see, at the time rs.status() was printed, the last heartbeat received from the failed node was already ten minutes old.

What am I doing wrong?

Answer:

You are missing a node: either another data-bearing node or an arbiter.

Here is why. To elect a primary, a replica set needs a strict majority – more than 50% – of its voting members. In a two-node set, the surviving node holds only one of two votes, so it can never win an election on its own and stays SECONDARY. This is by design: if a single node could decide to become primary, every network partition and every failing switch would result in a split-brain situation, where two nodes both assume they are primary.
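
To make the arithmetic concrete, here is a quick check you can run in the mongo shell. It is illustrative only and assumes the default of one vote per member:

shard0:SECONDARY> var total = rs.status().members.length;  // 2 members in this set
shard0:SECONDARY> Math.floor(total / 2) + 1;               // votes needed for a majority
2

Two votes are required, and the surviving node has only one – hence no election.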

Add an arbiter. Arbiters are cheap in terms of CPU, RAM and disk space, so you can run one on a small VM. Run it with

mongod --replSet shard0 --smallfiles --nojournal --noprealloc 

which further reduces its disk usage (add --port and --dbpath as appropriate for your environment).

Do NOT run the arbiter on one of the existing nodes – that would defeat its purpose: if the machine hosting both a data node and the arbiter went down, you would again lose the majority.
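
Once the arbiter process is running, register it from the primary. The host and port below are placeholders – substitute wherever you actually run the arbiter:

shard0:PRIMARY> rs.addArb("10.128.42.199:27020")  // replace with your arbiter's host:port
shard0:PRIMARY> rs.status()                       // the new member should report "stateStr" : "ARBITER"

With three voting members, the set can now elect a new primary whenever any single node fails.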
