These are chat archives for atomix/atomix
The problem with AtomixReplica is that it's still limited to 3 or 5 nodes. You can't just start any number of replicas and let the system figure out the replication itself. But now you can, or will be able to. Essentially, the user defines the number of Raft nodes and the number of backup nodes per Raft node, and Atomix dynamically scales the Raft cluster as nodes are added and removed. If a Raft voting member dies or is partitioned for long enough, Atomix can replace it by promoting another replica to prevent a loss of availability. Most importantly, this means you should be able to start 20 replicas and tell it you want 5 Raft nodes, and the cluster will continue operating even if the original 5 Raft nodes die. It will probably take a couple more days to go through all the testing.
Previously, either some nodes create an AtomixReplica and the rest create an AtomixClient, or you use a separate set of AtomixServers with clients. But in this case, you can just create an AtomixReplica on each node and let Atomix figure it out. The user provides a quorumHint and Atomix will ensure at least that many Raft nodes exist at all times. More interestingly, this also provides an additional level of fault tolerance. If a Raft node dies, Atomix can replace it with a non-Raft node to bring the number of alive Raft nodes back to the quorumHint and thus return the actual fault tolerance to the desired level. This allows it to behave more like HA systems where you can arbitrarily add and remove nodes, though adding and removing nodes is still done in a safe manner, and too many Raft nodes failing simultaneously can still cause a loss of availability. I.e., if the quorumHint is 3 and two Raft nodes die at the same time, the cluster will become unavailable, because Atomix/Copycat still use the Raft algorithm to do cluster configuration changes.
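A rough sketch of the per-node setup being described, assuming an Atomix 1.x-style builder. The withQuorumHint/withBackupCount setter names here simply mirror the quorumHint and backup-count parameters from the discussion above, and the transport/storage configuration is omitted; check the exact builder API against the release you're running.

```java
import io.atomix.AtomixReplica;
import io.atomix.catalyst.transport.Address;

import java.util.Arrays;
import java.util.List;

public class ReplicaNode {
  public static void main(String[] args) {
    // Every node runs the same code: its own address plus the addresses of
    // the other nodes it should contact to form or join the cluster.
    Address local = new Address("10.0.0.1", 8700);
    List<Address> cluster = Arrays.asList(
        new Address("10.0.0.1", 8700),
        new Address("10.0.0.2", 8700),
        new Address("10.0.0.3", 8700));

    // quorumHint: keep 3 Raft voting members alive at all times.
    // backupCount: keep 1 caught-up backup per Raft node for fast replacement.
    // (Setter names are assumptions based on the parameters described above.)
    AtomixReplica replica = AtomixReplica.builder(local, cluster)
        .withQuorumHint(3)
        .withBackupCount(1)
        .build();

    // Older pre-1.0 builds opened the replica with open(); later releases
    // moved to bootstrap()/join() instead.
    replica.open().join();
  }
}
```

With this in place, Atomix decides which of the started nodes become Raft voters, backups, or passive members, rather than the user wiring up replicas, clients, and standalone servers by hand.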
Once the cluster grows beyond the quorumHint, Raft followers begin replicating state changes to additional nodes. So, if the quorumHint is 3 and you add a fourth node to the cluster, one of the Raft followers will replicate committed entries to the new node. Clients can potentially read from that additional node and still maintain sequential consistency, and in the event that a Raft node fails, it can be more quickly replaced by the fourth node since that node is already up to date. Copycat doesn't allow a new Raft node to join the existing Raft cluster until its state is caught up, to prevent availability issues, but this ensures that the process of replacing a Raft node takes only a few seconds.
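For illustration, a hedged sketch of what reading through that extra, non-Raft node might look like from a client. The DistributedValue resource and a client builder existed in this era of Atomix, but the lookup and consistency-related method names changed between releases, so treat the calls below as assumptions rather than the definitive API.

```java
import io.atomix.AtomixClient;
import io.atomix.catalyst.transport.Address;
import io.atomix.variables.DistributedValue;

import java.util.Arrays;

public class SequentialReadExample {
  public static void main(String[] args) throws Exception {
    // Connect a client to any of the replicas, including the "extra"
    // passive node beyond the quorumHint (10.0.0.4 here, hypothetically).
    AtomixClient client = AtomixClient.builder(Arrays.asList(
            new Address("10.0.0.1", 8700),
            new Address("10.0.0.4", 8700)))
        .build();
    client.open().join();

    // Resource lookup; older builds used create("name", DistributedValue.class),
    // later ones getValue("name") -- adjust to the release you're on.
    DistributedValue<String> value =
        client.<String>getValue("config").get();

    // A sequentially consistent read does not have to go through the Raft
    // leader, which is what allows the caught-up fourth node to serve it.
    // (How to request SEQUENTIAL consistency explicitly varies by version.)
    String current = value.get().get();
    System.out.println("read: " + current);
  }
}
```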
So, you can just start a bunch of AtomixReplicas and let them figure it out amongst themselves, though that should not be misconstrued as scalability. You're scaling the number of replicas that can be supported, but not the amount of state or the throughput of state changes.