These are chat archives for atomix/atomix

28th Sep 2018
Luca Burgazzoli
@lburgazzoli
Sep 28 2018 08:55
is there any chance to get atomix/atomix#838 included in the next release?
Alex Robin
@alexrobin
Sep 28 2018 09:25
@kuujo ok I'll work on a pull request to enable retrieval of memberId from a service implementation
Luca Burgazzoli
@lburgazzoli
Sep 28 2018 16:54
@kuujo is it possible to create a fully ephemeral cluster where all the nodes are clients? my use case is messaging and leadership, no need to persist any data
Jordan Halterman
@kuujo
Sep 28 2018 19:08

@lburgazzoli just use a primary-backup partition group. The protocol is just in memory, really similar to Hazelcast.

managementGroup {
  type: primary-backup
  partitions: 1
}

partitionGroups.data {
  type: primary-backup
  partitions: 31
}

The risk is just that without any consensus layer (at least in the management group) the entire cluster is effectively eventually consistent, dependent on network conditions, and prone to split brain.
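For reference, a minimal Java sketch of the equivalent programmatic setup, assuming the Atomix 3.x builder API (the member id, address, and group names are placeholders):

import io.atomix.core.Atomix;
import io.atomix.protocols.backup.partition.PrimaryBackupPartitionGroup;

// Fully in-memory node: both the management group and the data group use the
// primary-backup protocol, so nothing is written to disk.
Atomix atomix = Atomix.builder()
    .withMemberId("member-1")                  // placeholder id
    .withAddress("10.0.0.1:5679")              // placeholder address
    .withManagementGroup(PrimaryBackupPartitionGroup.builder("system")
        .withNumPartitions(1)
        .build())
    .withPartitionGroups(PrimaryBackupPartitionGroup.builder("data")
        .withNumPartitions(31)
        .build())
    .build();

atomix.start().join();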

Luca Burgazzoli
@lburgazzoli
Sep 28 2018 19:27
@kuujo thx, and is there a configuration to get consistency without the need to have data stored on disk?
Jordan Halterman
@kuujo
Sep 28 2018 19:58

Technically, no. There’s currently an option to use MEMORY logs in Raft partitions, but that’s not really safe either. It’s not tolerant to the loss of a majority of any partition because of how the Raft protocol works. Once a majority of nodes in a partition go down and restart with no memory, any node can get elected leader, meaning all the data can be lost. For that reason, the MEMORY setting will be deprecated and removed soon.

So, the problem is that Raft is where we get consistency from, and the Raft protocol inherently requires persistence. One could be tempted to make the argument that leadership is transient and if the management group loses data then new primaries can be re-elected, but the Raft management group is also used to determine backups for the primary-backup protocol. It’s critical that when a Raft management partition loses availability and then regains it, it either re-elects the same primary or one of its backups. So, losing availability in a MEMORY management group could also mean losing data in primary-backup partitions.

This is really the same as any other distributed system: you can use Hazelcast for leader election, but you’ll get multiple leaders in a network partition. You have to switch to ZooKeeper - a quorum-based, persistent atomic broadcast protocol - to ensure there’s only one leader in a network partition.
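To make that concrete, here is a hedged sketch of the usual compromise, assuming the Atomix 3.x builder API: persist only the Raft management group to disk and keep the data group as in-memory primary-backup (member ids, addresses, and the data directory are placeholders):

import java.io.File;

import io.atomix.core.Atomix;
import io.atomix.protocols.backup.partition.PrimaryBackupPartitionGroup;
import io.atomix.protocols.raft.partition.RaftPartitionGroup;

// Only the Raft management group is persisted; the data group stays in memory.
Atomix atomix = Atomix.builder()
    .withMemberId("member-1")                                   // placeholder id
    .withAddress("10.0.0.1:5679")                               // placeholder address
    .withManagementGroup(RaftPartitionGroup.builder("system")
        .withNumPartitions(1)
        .withMembers("member-1", "member-2", "member-3")        // placeholder members
        .withDataDirectory(new File("/var/lib/atomix/system"))  // placeholder path
        .build())
    .withPartitionGroups(PrimaryBackupPartitionGroup.builder("data")
        .withNumPartitions(31)
        .build())
    .build();

atomix.start().join();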
Luca Burgazzoli
@lburgazzoli
Sep 28 2018 20:03
@kuujo thank you very much for the explanation
so now a completely different beast, and I hope I won't bother you any more
I'm getting "Class is not registered: io.atomix.cluster.MemberId, Note: To register this class use: kryo.register(io.atomix.cluster.MemberId.class);"
when joining an election with code similar to the one shown in the docs:
MemberId localMemberId = atomix.getMembershipService().getLocalMember().id();
Leadership<MemberId> leadership = election.run(localMemberId);
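One likely fix, sketched under the assumption that the 3.x primitive builders expose withSerializer(...): build the election with a Namespace that registers MemberId so Kryo knows about it ("my-election" is a placeholder primitive name):

import io.atomix.cluster.MemberId;
import io.atomix.core.election.LeaderElection;
import io.atomix.core.election.Leadership;
import io.atomix.utils.serializer.Namespace;
import io.atomix.utils.serializer.Namespaces;
import io.atomix.utils.serializer.Serializer;

// Register MemberId with the primitive's Kryo namespace (withSerializer is
// assumed to be available on the election builder).
LeaderElection<MemberId> election = atomix.<MemberId>leaderElectionBuilder("my-election")
    .withSerializer(Serializer.using(Namespace.builder()
        .register(Namespaces.BASIC)
        .register(MemberId.class)
        .build()))
    .build();

MemberId localMemberId = atomix.getMembershipService().getLocalMember().id();
Leadership<MemberId> leadership = election.run(localMemberId);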
Alex Robin
@alexrobin
Sep 28 2018 20:07
@kuujo I am actually currently investigating a bug in the exact configuration you mentioned. I have a Raft management group (with persistent log storage) and a primary-backup group for data with a single partition. There are times when the client retrieves the wrong primary node for the primary-backup group, and I haven't figured out why yet.
What I've narrowed it down to is that the problem is already present on the primitive service side (inside PrimaryElectorService). Do you have any insight into what could be happening?
Alex Robin
@alexrobin
Sep 28 2018 20:12
I have a feeling it happens when rebalancing primaries... Maybe the selected primary is actually the right one, but it rejects execute requests because it still has the backup role. Have you run into this already?