These are chat archives for atomix/atomix

29th
Dec 2015
Richard Pijnenburg
@electrical
Dec 29 2015 04:41
@kuujo you there?
Jordan Halterman
@kuujo
Dec 29 2015 05:48
yo
@electrical indeed
Jordan Halterman
@kuujo
Dec 29 2015 06:04
And Merry Christmas to you too :-) I’ve been out with family a lot. @jhalterman is here visiting for a couple days. I’ll be around a bit though. Still a work week for me!
Richard Pijnenburg
@electrical
Dec 29 2015 09:43
Hiya. Sorry. Went to sleep :) was a bit too late for me.
Jordan Halterman
@kuujo
Dec 29 2015 09:44
all good
I probably gotta get to bed soon too
haha
backwards times
Richard Pijnenburg
@electrical
Dec 29 2015 09:46
Hehe yeah, bloody timezones. I should be online later, in your morning.
Jordan Halterman
@kuujo
Dec 29 2015 09:46
sounds good
Richard Pijnenburg
@electrical
Dec 29 2015 09:47
I haven't found out why the client stops working :(
Very weird stuff
Jordan Halterman
@kuujo
Dec 29 2015 10:36
@electrical I know… The name of this branch turned out to be the opposite of reality, but here’s the work I’ve been doing on the client: https://github.com/atomix/copycat/commits/client-simplification
I redesigned the client from the ground up to be more componentized. The previous client was not test friendly at all, but this version has unit tests in place for all of the client’s major responsibilities: https://github.com/atomix/copycat/tree/client-simplification/client/src/test/java/io/atomix/copycat/client
It also manages sessions in a much more elegant way. I don’t expect that client to be working fully right now. Indeed, I just started testing it in a cluster a few minutes ago. But I do expect to put it through a lot of testing tomorrow and it shouldn’t take long to clean up, and I do think it should be a huge improvement over the previous incarnation. The way the client now tracks and recovers sessions will be much more reliable for ensuring a partitioned client recognizes it’s partitioned and takes additional measures to e.g. ensure two clients don’t believe themselves to hold the same lock at the same time because one is partitioned. I’ll explain more in documentation, but generally speaking the client more quickly detects failures, but at the same time is more likely to retain its session after a partition. Some work will have to be done in Atomix to take advantage of the client’s tracking.
…and this should fix the issue with the old client
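Editor's sketch of the lock scenario described above: a resource that gives up its local claim on a lock as soon as the client suspects it is partitioned, so two clients never both believe they hold it. This is a minimal illustration in Python, not Copycat's or Atomix's Java API; `SessionState`, `LockHolder`, and `on_state_change` are hypothetical names invented for this example.

```python
from enum import Enum


class SessionState(Enum):
    # Hypothetical states, named for this sketch only.
    OPEN = "open"          # client is talking to the cluster
    UNSTABLE = "unstable"  # client cannot reach the cluster; session may or may not still exist
    EXPIRED = "expired"    # cluster has confirmed the session is gone


class LockHolder:
    """Tracks whether this client may consider itself the lock owner."""

    def __init__(self):
        self.state = SessionState.OPEN
        self.holds_lock = False

    def acquire(self):
        # Only claim the lock while the session is known to be healthy.
        if self.state is not SessionState.OPEN:
            raise RuntimeError("cannot acquire lock while session is " + self.state.value)
        self.holds_lock = True

    def on_state_change(self, new_state):
        # Any state other than OPEN means the cluster may already have
        # granted the lock to someone else, so drop the local claim.
        self.state = new_state
        if new_state is not SessionState.OPEN:
            self.holds_lock = False
```

In this sketch the lock is not restored automatically when the state returns to `OPEN`; the caller has to acquire it again, which is the conservative choice when the client cannot know what happened during the partition.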
Richard Pijnenburg
@electrical
Dec 29 2015 10:38
Ahh, very nice! I'm looking forward to trying it out.
Jordan Halterman
@kuujo
Dec 29 2015 10:38
The issue with the old client was that commands were not handled properly across sessions. Without explaining all the internals of Copycat, Copycat requires that commands be sequenced in a certain way. In cases where a client lost its session, that sequencing was also lost and caused the client not to be able to submit any more commands. That should be resolved here. I’ll clean it up tomorrow and post an update.
Richard Pijnenburg
@electrical
Dec 29 2015 10:39
Cool!
Jordan Halterman
@kuujo
Dec 29 2015 11:01
Really, the problem with the client was twofold: it lost sessions more frequently than it should have and it didn't resubmit commands correctly when it did lose a session. The problem is, when a session is lost and a new session is opened, command sequence numbers start back at 1, but the current client never resets them. It was really a design problem that made it challenging to submit commands across sessions, hence the rewrite. This version definitely addresses both problems. It adds a state wherein the client doesn't know if its session is lost because it can't contact the cluster. In that case, a lock will preemptively be released on the client to ensure there's no conflict while the client can't talk to the cluster, but the client won't close its session. Once it's able to talk to the cluster again it will try to reestablish its session. If the cluster expired its session it will open a new session and transparently resubmit commands from the previous session. What this amounts to is Atomix clients that can be disconnected and reconnected without any impact to the user. Most of the tests are passing now, but there's still a bug or two in the client that should be simple to fix after a bit of rest :-) adios
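Editor's sketch of the sequencing problem described above: each session expects command sequence numbers starting at 1, so a client that opens a new session must reset its counter before resubmitting unacknowledged commands. This is a toy model in Python under stated assumptions, not Copycat's actual protocol or API; `FakeCluster`, `Client`, and every method name are hypothetical, and the model ignores the case where a command was applied but its acknowledgement was lost.

```python
class FakeCluster:
    """Stand-in for the cluster: accepts a command only if it carries
    the next expected sequence number for a live session."""

    def __init__(self):
        self._next_session_id = 1
        self._expected = {}  # session id -> next expected sequence number
        self.applied = []    # commands applied, in order

    def open_session(self):
        sid = self._next_session_id
        self._next_session_id += 1
        self._expected[sid] = 1  # sequencing starts at 1 for every new session
        return sid

    def expire_session(self, sid):
        self._expected.pop(sid, None)

    def submit(self, sid, seq, command):
        if sid not in self._expected:
            raise KeyError("unknown session")
        if seq != self._expected[sid]:
            return False  # out of sequence: rejected
        self._expected[sid] = seq + 1
        self.applied.append(command)
        return True


class Client:
    def __init__(self, cluster):
        self.cluster = cluster
        self.session_id = cluster.open_session()
        self.next_seq = 1
        self.pending = []  # commands not yet acknowledged

    def submit(self, command):
        self.pending.append(command)
        self._flush()

    def _flush(self):
        while self.pending:
            command = self.pending[0]
            try:
                ok = self.cluster.submit(self.session_id, self.next_seq, command)
            except KeyError:
                self._recover()  # session was lost: open a new one and retry
                continue
            if not ok:
                raise RuntimeError("command rejected as out of sequence")
            self.next_seq += 1
            self.pending.pop(0)

    def _recover(self):
        self.session_id = self.cluster.open_session()
        self.next_seq = 1  # the reset that the old client never performed
```

Dropping the `self.next_seq = 1` line in `_recover` reproduces the failure mode from the chat in this toy model: the new session expects sequence 1, the client keeps sending its old counter, and every later command is rejected.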
Richard Pijnenburg
@electrical
Dec 29 2015 11:02
Hehe. Cool. Catch you tomorrow.
Richard Pijnenburg
@electrical
Dec 29 2015 23:59
@kuujo any progress with the client stuff?