These are chat archives for atomix/atomix

13th
Oct 2017
Jordan Halterman
@kuujo
Oct 13 2017 02:15
I'm on vacation :-)
Jordan Halterman
@kuujo
Oct 13 2017 02:21

But IMO the idea of preferring leaders is in conflict with the Raft algorithm. Leader election is used in Raft to ensure safety, and strictly preferring leaders would make it unsafe. Raft's leader election algorithm is itself designed to prioritize nodes by whether or not they're up to date. So prioritizing leaders can only safely be done among the nodes that are up to date, and that set is constantly changing.
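
To make the "up to date" part concrete, here is a minimal sketch of the log comparison a Raft voter applies before granting its vote. The class and method names are hypothetical and not taken from Atomix; only the comparison rule (last log term first, then last log index) comes from Raft itself.

```java
// Hypothetical helper, not Atomix API: a sketch of Raft's log up-to-date check.
final class VoteCheck {
    /**
     * Returns true if the candidate's log is at least as up to date as the voter's.
     * The log with the higher last term wins; on equal terms, the longer log wins.
     */
    static boolean candidateIsUpToDate(long candidateLastTerm, long candidateLastIndex,
                                       long voterLastTerm, long voterLastIndex) {
        if (candidateLastTerm != voterLastTerm) {
            return candidateLastTerm > voterLastTerm;
        }
        return candidateLastIndex >= voterLastIndex;
    }
}
```

A static preference for one node cannot override this check, since a preferred node that is behind would be refused votes by any voter with a more recent log.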

IMO thinking about changing the Raft election protocol to include preferences is the wrong way to think about it. Preferences in the leader election algorithm can necessarily only include a majority of the nodes in the cluster, because the election algorithm is designed to elect a leader from the majority. So what if the preferred leader is in the minority? But Raft has a mechanism for transferring leadership that allows it to be transferred to any node. So the right way to prefer specific leaders IMO would be to use leadership transfer: periodically check leadership and availability, and transfer leadership to specific nodes by priority. This is basically what we're planning to do in our own project (ONOS).

The difference between using leadership transfer to balance leaders and modifying the election algorithm is that the former will catch up and transfer leadership to any available follower, while the latter has to be concerned about safety first and foremost. You get much more external control using leadership transfer.
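
As a rough illustration of the periodic check, the decision step could look something like the sketch below. Everything here is an assumption for the sake of the example: `LeaderBalancer`, `transferTarget`, and the use of plain strings as node IDs are hypothetical, and the actual transfer call is left out because it depends on the Raft implementation.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not Atomix or ONOS API: decides where leadership should go.
final class LeaderBalancer {
    /**
     * Picks the node leadership should be transferred to, if any.
     *
     * @param priorities    node IDs ordered from most to least preferred
     * @param available     node IDs currently considered reachable
     * @param currentLeader the node ID of the current leader
     * @return the preferred available node, or empty if no transfer is needed
     */
    static Optional<String> transferTarget(List<String> priorities,
                                           Set<String> available,
                                           String currentLeader) {
        for (String node : priorities) {
            if (!available.contains(node)) {
                continue; // skip nodes that are not reachable right now
            }
            if (node.equals(currentLeader)) {
                return Optional.empty(); // the best available node already leads
            }
            return Optional.of(node);
        }
        return Optional.empty(); // no preferred node is available
    }
}
```

A scheduler would call this every so often and, when a target comes back, ask the current leader to transfer leadership to it. The election protocol itself stays untouched.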
Jordan Halterman
@kuujo
Oct 13 2017 02:27
But you do have to be careful not to repeatedly transfer leadership to a node that shouldn't be winning elections, which is why you should perhaps use something like a failure detector. Leader-based protocols like Raft generally tend towards the fastest node as the leader and the fastest set of nodes as the majority, so repeatedly transferring to the minority could theoretically be problematic and may just end up causing more leader changes. Leader changes in general are one of the most expensive operations in our own clusters. You'd probably just want to keep a recent history of leadership to avoid that.
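
One possible shape for that recent history, purely as a sketch: count how often a node lost leadership within a time window and stop transferring to it once it crosses a threshold. The `TransferHistory` class, its parameters, and the threshold policy are all assumptions made for illustration, not something from Atomix or ONOS.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers recent leadership losses per node.
final class TransferHistory {
    private final long windowMillis;
    private final int maxLosses;
    private final Map<String, Deque<Long>> losses = new HashMap<>();

    TransferHistory(long windowMillis, int maxLosses) {
        this.windowMillis = windowMillis;
        this.maxLosses = maxLosses;
    }

    /** Records that the given node lost leadership at the given time. */
    void recordLoss(String node, long nowMillis) {
        losses.computeIfAbsent(node, n -> new ArrayDeque<>()).addLast(nowMillis);
    }

    /** Returns false if the node lost leadership too often within the window. */
    boolean mayTransferTo(String node, long nowMillis) {
        Deque<Long> times = losses.get(node);
        if (times == null) {
            return true; // no recorded losses for this node
        }
        // Drop entries that have aged out of the window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() > windowMillis) {
            times.removeFirst();
        }
        return times.size() < maxLosses;
    }
}
```

Checking `mayTransferTo` before each transfer keeps a slow or flaky preferred node from being handed leadership over and over, which would only add more leader changes.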