These are chat archives for atomix/atomix

11th Jan 2017
Jordan Halterman
@kuujo
Jan 11 2017 02:04
I can do it any time you guys feel good about it
not particularly waiting on anything else
Thiago Santos
@thiagoss
Jan 11 2017 02:45
Finished tests a while ago. Couldn't reproduce the issue. Thanks @kuujo !
Jordan Halterman
@kuujo
Jan 11 2017 02:47
@jhall11
I mean, I have other stuff in progress but this is a critical bug fix so no need to wait
That's awesome!
It was a great find, as I like to say
Jordan Halterman
@kuujo
Jan 11 2017 08:47
I actually think I managed to get all my changes into Atomix anyways… I’ll release it tomorrow if we need it to be released
hazemkmammu
@hazemkmammu
Jan 11 2017 09:55

@kuujo The way I understand it, the only way to send a direct message is,

GroupMember member = group.member("foo");
MessageProducer<String> producer = member.messaging().producer("bar");
producer.send("baz").thenRun(() -> {
  // Message acknowledged
});

group. member("foo") returns null when the node "foo" has failed/crashed. So there is no way to send a direct message when a node that is a persistent member has rejoined. Thanks for explaining the SYNC, ASYNC options. However they are relevant only after the message is sent.

hazemkmammu
@hazemkmammu
Jan 11 2017 10:01

I also noticed something else (unrelated to the concern above).

I left my web application idle for 3+ hours, and when I came back and tried to broadcast messages from one instance, the messages were never sent. That means the messages were not acknowledged. I think the reason is that the persistent member's session has expired and any node listening to its message consumer queue would be invalidated. That's all right.

To confirm this behaviour, I called joinGroup with DistributedGroup.Config options with the member expiry duration set to 2 minutes. However, I never saw the session expiring after waiting for 30 minutes. Did I do something wrong?

Jordan Halterman
@kuujo
Jan 11 2017 10:01
Yeah… so currently DistributedGroup can be configured with an expiration that defaults to Duration.ZERO but prevents persistent members from being removed for that duration after a failure. That certainly doesn’t seem like a great default behavior, but the problem is that groups can and should also be able to detect when a persistent member crashes. So, I think that expiration needs to be removed and a status constant and listener need to be added to GroupMember instead
hmm...
hazemkmammu
@hazemkmammu
Jan 11 2017 10:05
Can group.onLeave(..) be used to listen for a persistent member crash?
Jordan Halterman
@kuujo
Jan 11 2017 10:07

It can with the current behavior, but I don’t think that’s correct. A persistent member shouldn’t be removed from the group when its session expires. Currently, when a client with a persistent member’s session expires, the state on the servers isn’t removed, but events are sent to clients notifying them that the member was removed. It should just be a state change for persistent members.

One sec and I’ll submit a PR...

hazemkmammu
@hazemkmammu
Jan 11 2017 10:07
OK
Jordan Halterman
@kuujo
Jan 11 2017 10:14
just have to write a few tests
hazemkmammu
@hazemkmammu
Jan 11 2017 10:14
OK. No problem. I will post my questions here, please reply when you can.
I would like to make sure I understand this right. DistributedGroup.Config.withMemberExpiration(..) specifies how long to wait before removing a persistent member after it has crashed. Correct?
hazemkmammu
@hazemkmammu
Jan 11 2017 10:23
Or how long to wait before sending a 'member removed' event after the last client session activity?
Jordan Halterman
@kuujo
Jan 11 2017 10:52
indeed… the latter
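e.g. something like this (just a sketch; the getGroup/Config signatures here are from memory, so treat them as assumptions):

DistributedGroup.Config config = new DistributedGroup.Config()
  .withMemberExpiration(Duration.ofMinutes(2)); // "member removed" roughly 2 minutes after the last session activity

DistributedGroup group = atomix.getGroup("my-group", config).join(); // assumed overload taking a Config
LocalMember member = group.join("foo").join(); // joining with an explicit ID is what makes the member persistent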
Jordan Halterman
@kuujo
Jan 11 2017 11:18
The use case being, it’s possible for a member’s session to expire but for the node to still be alive, in which case the client will automatically recover and can receive messages that were sent while it was down. Immediately removing the member on clients is just not the right behavior.

Alright, so I submitted a PR that adds a GroupMember.Status enum and events for changes in persistent members’ statuses:

group.member("foo").onStatusChange(status -> {
  ...
});

When a persistent member’s session is expired, a status change event will be sent to clients rather than a leave event. When it reconnects with a new session, the status will change back to ALIVE. If a persistent member’s session is explicitly closed by the client, it will still be removed from the group.
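
For example, tying this back to the direct-messaging question above, other clients could do something like this (just a sketch; I'm assuming the non-ALIVE constant is named DEAD, so double-check that):

group.member("foo").onStatusChange(status -> {
  if (status == GroupMember.Status.ALIVE) {
    // the member's client reconnected with a new session, so direct messages will reach it again
    MessageProducer<String> producer = group.member("foo").messaging().producer("bar");
    producer.send("baz");
  } else {
    // e.g. GroupMember.Status.DEAD (assumed name): the session expired, but the member stays in the group
  }
});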

and the expiration still applies if someone feels compelled to automatically remove a member after x amount of time, but it defaults to -1 (infinite)
Jordan Halterman
@kuujo
Jan 11 2017 11:24
ugh totally messed up that PR but submitted another one
this is a really nice change and makes it feel a lot more correct and useful
hazemkmammu
@hazemkmammu
Jan 11 2017 12:50
@kuujo Awesome! Thanks :)
Jordan Halterman
@kuujo
Jan 11 2017 19:18
k all the tests look good and it’s merged into master and will be released… maybe today?
I don’t think there’s anything else left to be done
Jon Hall
@jhall11
Jan 11 2017 21:44
I think from my perspective it looks good
Jordan Halterman
@kuujo
Jan 11 2017 22:08
Alright it's on its way. Sort of multi-tasking but should be done in a couple hours
Chris Cleveland
@ccleve
Jan 11 2017 22:13
How does resource caching work in Atomix? We need to access some values in a DistributedMap several hundred times per second. Is the map backed by a local, in-memory cache, or does it do a network round-trip every time we call map.get()? (I'm wondering whether I should create my own, local cache which gets updated whenever it gets notified that the underlying map changes, but I don't want to reinvent the wheel if that is already how the map is implemented internally.)
Jordan Halterman
@kuujo
Jan 11 2017 23:01

ahh… that was actually something I was thinking about doing as well. I think there’s certainly a use case for a local cache in the map resource. But first to answer your question:

The cost of a read depends on the consistency level of the read. If you do:

map.get("foo", ReadConsistency.ATOMIC);

That will be an expensive read that’s guaranteed to be linearizable but requires a heartbeat to a majority of the cluster. So, the read will typically go to a follower, be proxied to a leader, the leader will ensure it’s still the leader by sending a heartbeat to a majority of the cluster, then the leader will respond to the follower and the follower will respond to the client. This is actually a bit more optimized than that. In practice, if a lot of reads are being submitted then the leader will batch them together so it’s not sending heartbeats around the cluster constantly. Also, the client can connect directly to the leader. But it’s still not ideal.

However, if you do:

map.get("foo", ReadConsistency.ATOMIC_LEASE);

That will still go through the leader but won’t require a heartbeat, so usually it’s a linearizable read but with some risk.

Finally, if you do:

map.get("foo", ReadConsistency.SEQUENTIAL);

A replica will usually read the state from the local state machine if it’s not too far behind the leader. A client’s request will be serviced by the first server that receives it if the server’s not too far behind the leader. But this decreases the consistency. State changes are guaranteed to occur in order from the perspective of the client, but not all clients will see the same state at the same time.

But those are the same guarantees we can get by just caching changes locally, so it makes sense to support a map-side cache. All we need to do is update DistributedMap to add ADD, CHANGE and REMOVE listeners and copy the events into a local map. We can add a ReadConsistency.LOCAL that’s only supported if a local cache is supported/enabled by the resource.

Jordan Halterman
@kuujo
Jan 11 2017 23:14
I'll throw something together before this release
Chris Cleveland
@ccleve
Jan 11 2017 23:17
Please correct me if I'm misunderstanding: you can set up a large Atomix cluster that has a few stateful nodes and many that are just listeners/clients. The non-stateful nodes would not have a local map. Correct?
I'm trying to set up a system akin to a Zookeeper cluster where you have a small number of nodes that manage the state, and a large number of clients. In that way I can have a cluster of 100 nodes or more without having a ton of heartbeat traffic.
Jordan Halterman
@kuujo
Jan 11 2017 23:24
I think you could do any of those things. A ZooKeeper-like system would use the standalone server to start a cluster of three or five nodes and use AtomixClient to communicate with them. Servers work almost exactly like ZooKeeper in that they hold state in memory and respond from that state, but Atomix is more configurable, allowing for linearizable reads without doing a sync-and-read. If you’re using a map that’s small and not changed frequently, though, I think it would make sense and would be quite simple to cache the map on 100 clients. The consistency guarantees are effectively the same, but network costs are only incurred when there’s a write to the map, rather than making a request to a server each time you want to read the map.
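In code that setup would look roughly like this (sketch from memory; double-check the builder/connect signatures against the docs):

// three or five standalone servers form the cluster; each of the ~100 nodes is just an AtomixClient
AtomixClient client = AtomixClient.builder()
  .withTransport(new NettyTransport())
  .build();

// connect using the addresses of the servers
client.connect(new Address("server1", 8700), new Address("server2", 8700), new Address("server3", 8700)).join();

DistributedMap<String, String> map = client.getMap("config").join();
map.get("foo", ReadConsistency.SEQUENTIAL).thenAccept(value -> {
  // served by the first server that receives it, per the read consistency discussion above
});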
Chris Cleveland
@ccleve
Jan 11 2017 23:28
Yes, I think that's the right approach. I could manage the local map in my application code if there were some kind of listener that would tell me whenever a key had changed or been removed. Although if there were 100+ nodes, that would be a lot of work for the leader to talk to all of them on every change. Would it make sense to gossip the changes around?
Jordan Halterman
@kuujo
Jan 11 2017 23:28
depends on how often that change happens
Chris Cleveland
@ccleve
Jan 11 2017 23:29
In my app, not often, but we'd have to design for the general case.
Jordan Halterman
@kuujo
Jan 11 2017 23:30
Atomix has a gossip protocol but it’s not really well tested - nowhere near as well tested as the consensus algorithm. And it may not be ideal since it gossips all state changes, including sessions and stuff. General gossip could be fine. Maybe set up 3 or 5 AtomixReplicas, listen for state changes in the DistributedMap, and then send them over a gossip protocol. That’s most efficient if the consistency issues aren’t a concern.
but last night I did add those events to DistributedMap, so they exist already. We’re documenting them right now
map.onAdd, map.onUpdate, map.onRemove
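and a client-side cache on top of them is basically just this (sketch only; the event accessor names are placeholders, so check the actual event types):

Map<String, String> cache = new ConcurrentHashMap<>();

map.onAdd(event -> cache.put(event.entry().getKey(), event.entry().getValue())); // accessor names assumed
map.onUpdate(event -> cache.put(event.entry().getKey(), event.entry().getValue()));
map.onRemove(event -> cache.remove(event.entry().getKey()));

// reads then come from the local map with no network round trip, with roughly
// the same guarantees as ReadConsistency.SEQUENTIAL
String value = cache.get("foo");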
Keep in mind that clients continuously send heartbeats to the cluster as well, so there’s always some communication. If you have 100 clients then you’re getting 100 periodic keep-alives, though the keep-alive interval is configurable. Sending 100 events on every change to a map would indeed be costly. But acknowledgements are relatively cheap since they’re done in batches in the keep-alives that clients have to send anyways.
Jordan Halterman
@kuujo
Jan 11 2017 23:35
that’s actually an area where ZooKeeper is a little more efficient than Atomix because it sacrifices some guarantees
ZooKeeper events aren’t guaranteed to be received by a client. They’re just published and forgotten about. But Atomix tracks and replicates events like normal state and awaits acknowledgements from clients before they’re removed from memory, so a client can switch servers and not lose events