These are chat archives for atomix/atomix

Dec 2017
Jordan Halterman
Dec 06 2017 04:54

FYI the "last bit of refactoring" in Atomix 2.1 is the addition of an alternative multi-primary replication protocol. The Raft primitives and state machines of yore are now generic primitives/RSMs abstracted from the underlying storage/replication protocol. Primitives in Atomix 2.1 configure the underlying protocol based on the configured Consistency, Persistence, and Replication for the primitive.
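
As a rough illustration of configuration driving the protocol choice, here is a minimal sketch. Every name in it (`ProtocolChoiceSketch`, the enum constants, `choose`) is hypothetical and is not the Atomix 2.1 API. The persistence rule follows from the multi-primary protocol keeping state in memory only; the consistency rule is purely an assumption made for the example.

```java
// Hypothetical sketch; none of these types are taken from Atomix.
class ProtocolChoiceSketch {
    enum Consistency { LINEARIZABLE, SEQUENTIAL }
    enum Persistence { PERSISTENT, EPHEMERAL }
    enum Replication { SYNCHRONOUS, ASYNCHRONOUS }
    enum Protocol { RAFT, MULTI_PRIMARY }

    /** Picks a protocol for a primitive from its configured properties. */
    static Protocol choose(Consistency consistency, Persistence persistence, Replication replication) {
        // Multi-primary state lives in memory only, so durable primitives need Raft.
        if (persistence == Persistence.PERSISTENT) {
            return Protocol.RAFT;
        }
        // Assumption for this sketch: the strongest consistency level maps to Raft.
        if (consistency == Consistency.LINEARIZABLE) {
            return Protocol.RAFT;
        }
        // In this sketch, replication only tunes how backups are updated; it does not change the choice.
        return Protocol.MULTI_PRIMARY;
    }
}
```

For example, `choose(Consistency.SEQUENTIAL, Persistence.EPHEMERAL, Replication.ASYNCHRONOUS)` selects `MULTI_PRIMARY` under these assumed rules.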

The multi-primary protocol is a Hazelcast-like protocol that doesn't suffer from many of the same consistency issues. Hazelcast and systems like it tend to partition data based on an eventually consistent view of the cluster, resulting in the potential for split brain and data loss. But because Atomix has a consensus protocol built into it, it uses the consensus protocol to elect primaries for each partition. This avoids split brain and provides stronger consistency guarantees for the new multi-primary replication protocol.
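
To make the split-brain point concrete, here is a minimal sketch of electing one primary per partition from a single agreed-upon view. The `Elector` class stands in for a state machine replicated by the consensus protocol; it, `Term`, and `Backup` are hypothetical names, and the term-based rejection of stale primaries is an assumption for the example rather than a description of the Atomix implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the Atomix election API.
class ElectionSketch {
    /** The primary for a partition during one term. */
    static final class Term {
        final long number;
        final String primary;

        Term(long number, String primary) {
            this.number = number;
            this.primary = primary;
        }
    }

    /** Stand-in for a consensus-replicated state machine: every node sees the same terms. */
    static class Elector {
        private final Map<Integer, Term> terms = new HashMap<>();

        /** Keeps the current primary while it is live; otherwise elects the first live node in a new term. */
        synchronized Term elect(int partition, List<String> liveNodes) {
            if (liveNodes.isEmpty()) {
                throw new IllegalArgumentException("no live nodes");
            }
            Term current = terms.get(partition);
            if (current != null && liveNodes.contains(current.primary)) {
                return current;
            }
            Term next = new Term(current == null ? 1 : current.number + 1, liveNodes.get(0));
            terms.put(partition, next);
            return next;
        }
    }

    /** A backup rejects writes stamped with a term older than the newest one it has seen. */
    static class Backup {
        private long highestTerm;

        synchronized boolean accept(long term) {
            if (term < highestTerm) {
                return false; // write from a deposed primary
            }
            highestTerm = term;
            return true;
        }
    }
}
```

Because there is only one sequence of terms per partition, two nodes cannot both be accepted as primary for the same term, which is the property an eventually consistent membership view cannot give.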

The multi-primary protocol works by simply assigning a primary and backups for each primitive partition. Changes can be replicated either synchronously or asynchronously to any number of nodes depending on the primitive configuration. All multi-primary state is in-memory only. Additionally, the client protocol is significantly more lightweight than the Raft client protocol in that it relies much more heavily on TCP ordering/delivery guarantees for bi-directional client-server communication (the Raft implementation adds significant coordination on top of TCP to guarantee e.g. sequential ordering and guaranteed delivery when switching leaders). These properties together provide an extremely fast alternative to Raft-based primitives.
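
The primary/backup flow described above can be sketched as follows. This is an in-process toy under stated assumptions: `Replica`, `Primary`, and the `syncBackups` parameter are hypothetical names, there is no networking, and asynchronous writes are handed to the common thread pool, so their relative order is not preserved the way a real protocol would need.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; not the Atomix replication API.
class PrimaryBackupSketch {
    /** An in-memory replica of one partition. */
    static class Replica {
        private final Map<String, String> state = new HashMap<>();

        synchronized void apply(String key, String value) {
            state.put(key, value);
        }

        synchronized String get(String key) {
            return state.get(key);
        }
    }

    /** The primary applies each write locally, then replicates it to its backups. */
    static class Primary extends Replica {
        private final List<Replica> backups;
        private final int syncBackups; // number of backups updated before put() returns

        Primary(List<Replica> backups, int syncBackups) {
            this.backups = backups;
            this.syncBackups = syncBackups;
        }

        /** Returns one future per asynchronous backup so callers may wait for them if they wish. */
        List<CompletableFuture<Void>> put(String key, String value) {
            apply(key, value);
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < backups.size(); i++) {
                Replica backup = backups.get(i);
                if (i < syncBackups) {
                    backup.apply(key, value); // synchronous: complete before returning
                } else {
                    pending.add(CompletableFuture.runAsync(() -> backup.apply(key, value)));
                }
            }
            return pending;
        }
    }
}
```

With two backups and `syncBackups = 1`, a `put` returns once the primary and the first backup hold the value; the second backup catches up when its future completes.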

The multi-primary protocol is actually being developed for a real-world use case as well. We will begin integrating Atomix 2.1 into ONOS in the coming weeks, where we'll use the multi-primary protocol to store certain network state in a more deterministic manner. This will provide significantly faster access to performance-critical primitives by co-locating the data with the nodes that access it.

Anyways, the multi-primary protocol will be in Atomix 2.1, and the first release is targeted for around January 1st.

Johno Crawford
Dec 06 2017 08:49
sgtm. With all the refactoring, did that include lifting the joining/leaving code from ONOS into Atomix, or is that still on the to-do list?
Jordan Halterman
Dec 06 2017 19:31
Ahh damn I forgot about that 🤔
Going to have to get it back in to get this into ONOS anyways