These are chat archives for atomix/atomix
@txm119161336_twitter the reason for the PROMOTABLE role is actually to allow the leader to catch up a follower before it becomes a voting member. When a node is added to the Raft cluster, it’s first added as a PROMOTABLE member, which means that it can’t vote in elections but does receive entries from the leader. Once it’s caught up to the leader, it’s then promoted to an ACTIVE voting member.
This is done because adding a node that doesn’t have all committed entries can block writes to the cluster. For example, if a 2-node cluster adds a third node and then one of the original 2 nodes crashes, the cluster is blocked until the leader can replicate all its changes to the added node. By catching the node up before it becomes a voting member, we reduce the chance that the cluster becomes unavailable.
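To make that concrete, here's a rough Python sketch of the catch-up-then-promote idea. It is not Atomix code: `Member`, `can_commit`, and `maybe_promote` are made-up names, and it assumes a node is promoted only once its log has reached the leader's last index.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    role: str          # "ACTIVE" (votes) or "PROMOTABLE" (replicates only)
    match_index: int   # highest log index known to be stored on this member
    alive: bool = True


def quorum(members):
    """Majority size, counting voting (ACTIVE) members only."""
    voters = [m for m in members if m.role == "ACTIVE"]
    return len(voters) // 2 + 1


def can_commit(members, index):
    """True if a majority of voters are alive and hold the entry at `index`."""
    acks = sum(
        1 for m in members
        if m.role == "ACTIVE" and m.alive and m.match_index >= index
    )
    return acks >= quorum(members)


def maybe_promote(member, leader_last_index):
    """Promote a PROMOTABLE member only once it has caught up (assumed rule)."""
    if member.role == "PROMOTABLE" and member.match_index >= leader_last_index:
        member.role = "ACTIVE"
    return member.role


# The new node "c" joins as PROMOTABLE and does not count toward the quorum.
a = Member("a", "ACTIVE", 100)
b = Member("b", "ACTIVE", 100)
c = Member("c", "PROMOTABLE", 0)
cluster = [a, b, c]

maybe_promote(c, leader_last_index=100)   # still PROMOTABLE, log is behind
c.match_index = 100                       # leader finishes replicating to c
maybe_promote(c, leader_last_index=100)   # now ACTIVE

b.alive = False                           # an original node crashes
writable = can_commit(cluster, 100)       # a and c already form a majority
```

If `c` had instead been added as ACTIVE with `match_index=0`, the same crash of `b` would leave `can_commit` false until the leader had replicated the whole log to `c`.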
Yeah, that’s not the reason for this particular change, but TBH I’m not confident in the Raft configuration protocol for exactly this reason. The protocol is not proven, and configurations are inevitably applied at different times on different nodes.
The idea behind the Raft configuration change protocol is:
• The leader logs a configuration change entry and immediately applies the change
• Nodes apply the change as the configuration change entry is replicated
But there’s then a chance that a configuration change may be applied to one or more nodes but never actually be committed. So, I think to handle this safely, nodes would actually have to preserve a record of the prior configuration to be able to revert to that configuration in the event a configuration change is unsuccessful (an entry from some other term is committed at the configuration change index). Nothing like this is implemented in either version.
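Roughly what I mean by preserving the prior configuration, as a Python sketch. Again, this is not implemented anywhere: `ConfigTracker` and its method names are hypothetical, and it assumes at most one uncommitted configuration change at a time.

```python
class ConfigTracker:
    """Keeps the last committed configuration so an uncommitted one can be undone."""

    def __init__(self, members):
        self.committed = (0, frozenset(members))  # (log index, members)
        self.pending = None                       # applied on append, not yet committed

    @property
    def current(self):
        # Nodes use the latest configuration in the log, committed or not.
        return (self.pending or self.committed)[1]

    def on_append(self, index, members):
        # The configuration entry takes effect as soon as it is written to the log.
        self.pending = (index, frozenset(members))

    def on_commit(self, commit_index):
        # Once the entry is committed, the prior configuration is no longer needed.
        if self.pending and self.pending[0] <= commit_index:
            self.committed, self.pending = self.pending, None

    def on_truncate(self, from_index):
        # A new leader overwrote the log from `from_index` onward: if that removes
        # the pending configuration entry, fall back to the committed one.
        if self.pending and self.pending[0] >= from_index:
            self.pending = None


tracker = ConfigTracker({"a", "b"})
tracker.on_append(5, {"a", "b", "c"})   # applied immediately
tracker.on_truncate(5)                  # entry 5 replaced by another term's entry
members_after_revert = tracker.current  # back to {"a", "b"}
```

The single `pending` slot is what keeps the revert simple; allowing several uncommitted changes at once would need a stack of prior configurations instead.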
Another option would be for the leader to commit the change without applying it and then count the number of nodes that have applied it to determine when it’s safe to commit the next configuration change. This seems to me like a safer, albeit more complicated, approach. But at least it guarantees a configuration change cannot be undone after it’s applied.
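A sketch of the bookkeeping the leader would need for that second option, in Python. Everything here is hypothetical (`ConfigChangeGate` is a made-up name), and the threshold is my assumption: I'm treating "a majority of members have applied it" as the point where the next change is safe, which would still need to be checked.

```python
class ConfigChangeGate:
    """Leader-side gate: one configuration change in flight at a time."""

    def __init__(self, members):
        self.members = set(members)
        self.in_flight = None    # index of the committed change still being applied
        self.applied_by = set()  # members that reported applying it

    def on_committed(self, index):
        # The change is committed first; nodes apply it only after that.
        self.in_flight = index
        self.applied_by = set()

    def on_applied(self, node, index):
        # Ignore reports for older changes or from unknown nodes.
        if index == self.in_flight and node in self.members:
            self.applied_by.add(node)

    def can_start_next(self):
        if self.in_flight is None:
            return True
        # Assumed threshold: a majority of members have applied the change.
        return len(self.applied_by) >= len(self.members) // 2 + 1


gate = ConfigChangeGate({"a", "b", "c"})
gate.on_committed(12)
gate.on_applied("a", 12)
blocked = not gate.can_start_next()   # only 1 of 3 has applied it
gate.on_applied("b", 12)
ready = gate.can_start_next()         # 2 of 3 have applied it
```

Because nodes only apply a change that is already committed, there is never an applied-but-uncommitted configuration to roll back, which is the trade for the extra round of counting.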
Anyways, on another note, I introduced support for member groups (zone/rack/host aware replication): atomix/atomix#463
I need to do some more experimenting with the implementation to gain some confidence it’s correct, but it sure looks cool :-P