These are chat archives for atomix/atomix

17th Feb 2017
Jonathan Halterman
@jhalterman
Feb 17 2017 00:31
@jhall11 A few days late, but the docs for Trinity are public, just a diff URL http://atomix.io/trinity/api/
Jonathan Halterman
@jhalterman
Feb 17 2017 00:37
...Fixed the doc link in the readme.
Jon Hall
@jhall11
Feb 17 2017 00:44
cool!
thanks
Jordan Halterman
@kuujo
Feb 17 2017 09:02
Copycat 1.2.1-rc1 and Atomix 1.0.1-rc1 released
Johno Crawford
@johnou
Feb 17 2017 09:03
@kuujo congrats!
I noticed in an old GitHub ticket that you planned to remove direct messaging; was that reversed?
János Pásztor
@janoszen
Feb 17 2017 15:58
Hello everyone. I'm back with another potentially stupid question. If I set the REQUEST_REPLY policy on a message, how do I get said reply?
Also, can I somehow get the member the message came from?
Jordan Halterman
@kuujo
Feb 17 2017 16:32
@johnou I don't have any plans to remove the current incarnation right now. But messaging is sort of a questionable topic to me. I mean, there's certainly a use case for it, but I think only within the context of cluster management/configuration/coordination, which is why messaging is now built into DistributedGroup. The messaging APIs use a pretty strong consistency model that's expensive, so they're poor for general messaging. Their very existence risks them being abused, just as I think the same thing has been abused in ZooKeeper in some cases. But there's still a use case for strongly consistent messages between clients, particularly in DistributedGroup, so I think their value still outweighs that risk right now.
Jordan Halterman
@kuujo
Feb 17 2017 16:41
@janoszen the send(Object) method on MessageProducer returns a CompletableFuture that will be completed once all replies are received. If the producer is direct, it will be completed with the reply of the single member. If the producer's not direct, it will be completed with a Collection of replies. MessageConsumers reply to a message with Message.reply, ack with Message.ack, fail with Message.fail, etc.
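Putting that together, it ends up looking something like this. The names here ("workers", "ident", the atomix client variable) are just examples, and the Options/Execution bits are from memory, so double-check them against the docs:

DistributedGroup group = atomix.getGroup("workers").join();

// Consumer side: a joined member replies to each message it receives.
LocalMember local = group.join().join();
MessageConsumer<String> consumer = local.messaging().consumer("ident");
consumer.onMessage(message -> message.reply("pong from " + local.id()));

// Producer side: send a request and handle the reply (or replies) asynchronously.
MessageProducer<String> producer = group.messaging().producer("ident",
    new MessageProducer.Options().withExecution(MessageProducer.Execution.REQUEST_REPLY));
producer.send("ping").thenAccept(replies -> {
  // For a non-direct producer this is a Collection of replies;
  // for a direct producer it's the single member's reply.
  System.out.println(replies);
});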
János Pásztor
@janoszen
Feb 17 2017 16:42
OK, but how do I get the contents of the response?
Jordan Halterman
@kuujo
Feb 17 2017 16:47

The CompletableFuture should be completed with the response:

producer.send("foo").thenAccept(response -> { ... });

Of course, the request/response just need to be serializable.

As for getting the member from which a message came, we'd just have to add a source() or something to Message.

János Pásztor
@janoszen
Feb 17 2017 16:48
Hmmm ok. So I'd need to insert that on the client side. Not ideal, but understandable.
The thenAccept method does not seem to work, even though I'm using the message.reply() function on the other side
Jordan Halterman
@kuujo
Feb 17 2017 16:53
The thing that makes messaging difficult/risky is that blocking in this type of situation can be expensive. You can use join() to block for the response, but you're waiting for two commits to the cluster and two events from the cluster, which is effectively blocking on a bunch of round trips. Synchronous messaging is really what's impractical/expensive and may be removed in the next version to force users to explicitly take on the cost of separate writes to the cluster.
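For example, reusing the hypothetical producer from the earlier sketch; the cost is the same either way, the difference is only where you pay it:

// Blocking: join() waits out all of those commits and events on the calling thread.
Object response = producer.send("ping").join();

// Non-blocking: the same request, but the reply is handled whenever it arrives.
producer.send("ping").thenAccept(r -> System.out.println("reply: " + r));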
János Pásztor
@janoszen
Feb 17 2017 16:54
My messages aren't really meant to be important. They are basically just requests to cluster members to identify themselves. If they don't, the next group ID request will catch them 10 seconds later
I was not expecting messages to be consistent
Anyway, I solved my identifying-the-sender problem by casting it to GroupMessage
Jordan Halterman
@kuujo
Feb 17 2017 17:55

@janoszen that works :-)
But IMO you should probably just pass metadata for each group member to do this.

LocalMember member = group.join("1", new SomeMetadata()).join();

The metadata can be read from any GroupMember object:

group.onJoin(member -> {
  SomeMetadata metadata = member.<SomeMetadata>metadata().get();
});

That metadata can be an arbitrary (serializable) object.
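For what it's worth, SomeMetadata above can just be a plain serializable POJO; this one is made up for the example, and assumes default serializer settings that accept plain Java serialization:

public class SomeMetadata implements java.io.Serializable {
  private final String host;
  private final int port;

  public SomeMetadata(String host, int port) {
    this.host = host;
    this.port = port;
  }

  public String host() { return host; }
  public int port() { return port; }
}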

János Pásztor
@janoszen
Feb 17 2017 17:55
Hmmm I've added the data I need to the message itself, but this also looks cool
I just needed to do a little trickery to get the information from Docker
Jordan Halterman
@kuujo
Feb 17 2017 17:56
Well, the benefit is that you're getting rid of race conditions after a member joins if you're currently having to send a message to ask for info from a new member
János Pásztor
@janoszen
Feb 17 2017 17:56
mapped ports and all
Jordan Halterman
@kuujo
Feb 17 2017 17:56
@jhall11 FYI, in case you didn't see it, the releases should be in Maven Central
János Pásztor
@janoszen
Feb 17 2017 17:57
I will definitely change to this once the rest is running
János Pásztor
@janoszen
Feb 17 2017 18:03
One final question and then I'll leave you guys alone. When I get the message "Suspected network partition. Stepping down.", can I react to that somehow? When does this happen?
If I have nodes joining and leaving constantly, can this happen?
Jon Hall
@jhall11
Feb 17 2017 18:06
@kuujo I just built using the rc from Maven Central, so far, so good
Jordan Halterman
@kuujo
Feb 17 2017 18:28
@janoszen good question. You can't react to it. It's really just an internal log indicating that a leader is stepping down. This will happen when a leader fails to contact a majority of the cluster for a while. The reason is that if a leader can't contact a majority of the cluster, it's possible another leader could have been elected. We don't want a partitioned leader to stick around and keep accepting client requests that will never complete. Forcing the leader to step down forces clients to try to find another leader. This should really be a fairly infrequent occurrence, but changing quorum sizes could perhaps make it more likely. If you don't remove nodes correctly (using leave rather than shutdown) then it will definitely happen and your cluster will eventually just become completely unavailable (since the quorum size doesn't change with shutdown).
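Concretely, with replica being whichever AtomixReplica the node runs (just to show the difference):

// Graceful removal: the node leaves the cluster configuration, so the
// quorum size shrinks with it and the remaining nodes stay available.
replica.leave().join();

// Plain shutdown: the node stops but is still counted toward the quorum,
// so enough of these will eventually cost the cluster its majority.
replica.shutdown().join();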
János Pásztor
@janoszen
Feb 17 2017 19:08
OK, so if a node crashes, the other nodes need to remove it from the cluster
(I have 3 "special snowflake" nodes)
Jordan Halterman
@kuujo
Feb 17 2017 22:51
That's right... but keep quorum sizes in mind. This is something that's still being experimented with in Atomix. There are some patterns that in theory can be used to manage a dynamic cluster. The problem with adding nodes to a Raft cluster is that it can cause a period of unavailability if the quorum size changes. So, typically the cluster has to replicate data to the new node and then add it to the quorum. Similarly, removing a node can decrease the quorum size and thus fault tolerance. So, Copycat actually supports adding nodes in a state (PASSIVE) in which they're not counted in the quorum but are kept nearly in sync with the Raft cluster.

In Atomix, the experiment that's being done uses this feature to create a dynamic cluster by quickly replacing failed nodes. There are some challenges with this since the core Raft nodes have to have explicit knowledge of all other core Raft nodes. But where this becomes particularly interesting is with sharding, as is done in ONOS. The problem with sharding, of course, is that even when a minority of the cluster goes down, you could still lose the majority of a shard and thus have partial unavailability.

So, to address this you use a core Raft cluster to store partition configuration information, and make individual partitions dynamically scalable by reconfiguring the cluster when the status of a node changes (which Copycat tracks). When a node is partitioned from the leader, promote a passive node and then demote the active one, thus replacing an unavailable follower with an available one. This dynamic management of partitions ensures more than a minority of nodes within a partition can fail as long as they're not simultaneous failures (nodes are on different machines or racks). There's work that needs to be done to make Atomix rack-aware for that use case, though.
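The promote/demote part of that would look roughly like this. I'm writing the cluster/member calls from memory (server being a CopycatServer), so take it as a sketch of the pattern rather than a recipe:

Cluster cluster = server.cluster();
cluster.onJoin(member -> {
  member.onStatusChange(status -> {
    if (status == Member.Status.UNAVAILABLE) {
      // Find a passive standby to take the unavailable member's place.
      cluster.members().stream()
          .filter(m -> m.type() == Member.Type.PASSIVE)
          .findFirst()
          .ifPresent(standby ->
              // Promote the standby into the quorum, then demote the dead member.
              standby.promote(Member.Type.ACTIVE)
                  .thenCompose(v -> member.demote(Member.Type.PASSIVE)));
    }
  });
});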
I'm babbling about nonsense