[MongoDB]: Why we need an odd number of nodes in a replica set
A primary is elected by a majority of the replica set’s voting members.
Imagine a replica set with three voting members. Let’s say that node A is the primary and nodes B and C are secondaries. Node A goes down, so nodes B and C hold an election. They still form a majority (two out of three). The election is decided first by priority. If nodes B and C have the same priority, the one whose oplog is most up to date with respect to the failed primary wins. Let’s say it’s node B.
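The rule described in this paragraph can be sketched as a toy model in Python. This is only an illustration of the description above (majority check, then priority, then oplog recency), not MongoDB’s actual election protocol; the `Member` class, its fields, and `pick_new_primary` are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    priority: float    # hypothetical field: higher is preferred
    last_applied: int  # hypothetical field: position of the last oplog entry applied
    up: bool = True


def pick_new_primary(members):
    """Return the name of the toy election winner, or None if there is no majority."""
    reachable = [m for m in members if m.up]
    # A strict majority of ALL voting members must be reachable.
    if len(reachable) * 2 <= len(members):
        return None
    # Rule as described in the text: priority first, then oplog recency.
    winner = max(reachable, key=lambda m: (m.priority, m.last_applied))
    return winner.name


replica_set = [
    Member("A", priority=1, last_applied=120, up=False),  # the failed primary
    Member("B", priority=1, last_applied=118),
    Member("C", priority=1, last_applied=115),
]
print(pick_new_primary(replica_set))  # B
```

With equal priorities, node B wins because its (made-up) oplog position is closer to that of the failed primary.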
Once node A comes back up, there is no new election (assuming node A does not have a higher priority than node B). Node B remains the primary, and nodes C and A are now secondaries.
On the other hand, if two nodes go down, you no longer have a majority, so the replica set can’t accept writes until at least one of the two failed servers comes back up and reconnects to the single surviving node.
Now imagine a replica set with four voting members. Let’s say that node A is the primary and nodes B, C and D are secondaries. Node A goes down, so nodes B, C and D hold an election. They of course form a majority (three out of four).
However, if two nodes go down, you don’t have a majority (two out of four is not more than half), so the replica set is again in read-only mode.
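The three-member and four-member walkthroughs come down to one check: do the reachable voting members make up more than half of all voting members? A minimal Python sketch of that check follows; `has_majority` is a hypothetical helper written for this post, not a MongoDB API.

```python
def has_majority(voting_members: int, reachable: int) -> bool:
    """True when the reachable voting members are a strict majority of the set."""
    return reachable * 2 > voting_members


# (total voting members, members that are down)
scenarios = [(3, 1), (3, 2), (4, 1), (4, 2)]

for total, down in scenarios:
    up = total - down
    state = "can elect a primary" if has_majority(total, up) else "read-only"
    print(f"{total} members, {down} down: {up}/{total} -> {state}")
```

Both sets survive one failure, and both become read-only after two.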
So that’s why an odd number is recommended. If you lose a single member in a 3-member replica set, it’s the same as losing a single member in a 4-member replica set: you still have a majority, and a new primary can be elected. On the other hand, if you lose two members in a 3-member or a 4-member replica set (or, in general, half or more of the members of an n-member replica set), the impact is again the same: no new primary can be elected.
So, to make a long story short, there is no fault-tolerance gain from adding a member to reach an even number: a 4-member replica set tolerates the same single failure as a 3-member one.
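To see the pattern across set sizes, the arithmetic can be tabulated with a short Python sketch. It assumes every member votes and that the required majority is floor(n/2) + 1; the function names `majority` and `fault_tolerance` are chosen here for illustration only.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return voting_members - majority(voting_members)


print("members  majority  fault tolerance")
for n in range(1, 8):
    print(f"{n:>7}  {majority(n):>8}  {fault_tolerance(n):>15}")
```

Each even size has the same fault tolerance as the odd size just below it: 3 and 4 members both tolerate one failure, and 5 and 6 members both tolerate two.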
For more, see election internals.
Although I understand the odd-number concept in a replica set to some extent, it raises the following questions.
Is the advantage of an odd number of members an internal mechanism of MongoDB, or is it just a way to increase fault tolerance?
Is an odd number of members in a replica set recommended (preferable) or mandatory for production?
Voting is possible when the ratio of available nodes to total nodes is greater than 0.5. Is that correct in all cases?
Please provide some scenarios or examples where exactly an odd number of nodes makes the difference. Also, I didn’t understand the network partition example properly; please explain.
In a 6-node RS (only primary and secondaries, no arbiter), failover still works with up to two nodes down (4/6 > 0.5). That means that, regardless of whether the number of members is even or odd, failover succeeds up to a certain number of node failures, and that number depends on how many nodes you have: the more nodes, the more failures you can survive.
Moreover, the fault tolerance of a 5-node and a 6-node replica set is the same: 2. (Source: http://docs.mongodb.org/manual/core/replica-set-architectures/)
Finally, in a 3-node replSet, if the PRIMARY goes down, we see that the set elects a new PRIMARY and everything is fine, without any downtime.
But if another member goes down (2 down in total), the 1 remaining member does not become PRIMARY and a complete outage occurs. We understand this is because the replSet does not have a majority for an election.
But shouldn’t our 1 surviving member be able to function on its own? Is there a way to configure it so that we get this behavior? Or can we expect this in the near future?