These are chat archives for atomix/atomix
State changed: SUSPENDED this is actually the correct result since it couldn’t commit a keep-alive to the cluster (a leader change occurred), but it should have detected that its session was expired. Gotta try to reproduce this
unknown session logs on servers are a result of the client not knowing its session was expired
UnknownSessionException for keep-alives on every even session number, and clients properly detect and recover the session in Copycat. This could be an issue with the Atomix
RecoveryStrategy though. This is good
RecoveryStrategy and it’s an easy fix :-) Thanks @electrical!
open() method to resources
CopycatClient interface now exposes
States which indicate when the client loses its session, and so Atomix resources use their own
InstanceClient) which monitors the real
CopycatClient for state changes and creates a new logical session if the underlying client’s session changes.
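A rough sketch of that wrapper idea, in case it helps picture it. Everything here is a hypothetical stand-in (`UnderlyingClient`, `ResourceClient`, the `State` enum, the session ids), not the real Copycat or Atomix API. It only shows a wrapper listening for state changes and starting a new logical session when the underlying session id turns out to be different after a reconnect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch only: every type here is a hypothetical stand-in, not the Copycat or Atomix API. */
final class SessionRecoverySketch {

    /** Hypothetical client states; SUSPENDED means the session may have been lost. */
    enum State { CONNECTED, SUSPENDED, CLOSED }

    /** Stand-in for the real client: holds a session id and notifies state listeners. */
    static final class UnderlyingClient {
        private final List<Consumer<State>> listeners = new ArrayList<>();
        private long sessionId;

        UnderlyingClient(long sessionId) {
            this.sessionId = sessionId;
        }

        long sessionId() {
            return sessionId;
        }

        void onStateChange(Consumer<State> listener) {
            listeners.add(listener);
        }

        /** Test hook: moves to a new state, optionally with a newly registered session id. */
        void transition(State next, long nextSessionId) {
            sessionId = nextSessionId;
            for (Consumer<State> listener : listeners) {
                listener.accept(next);
            }
        }
    }

    /** Stand-in for the per-resource wrapper that tracks its own logical session. */
    static final class ResourceClient {
        private final UnderlyingClient client;
        private long observedSessionId;
        private int logicalSession = 1;

        ResourceClient(UnderlyingClient client) {
            this.client = client;
            this.observedSessionId = client.sessionId();
            client.onStateChange(this::handleStateChange);
        }

        private void handleStateChange(State state) {
            // Only a reconnect under a different session id counts as a lost session.
            if (state == State.CONNECTED && client.sessionId() != observedSessionId) {
                observedSessionId = client.sessionId();
                logicalSession++;
            }
        }

        int logicalSession() {
            return logicalSession;
        }
    }

    public static void main(String[] args) {
        UnderlyingClient client = new UnderlyingClient(1L);
        ResourceClient resource = new ResourceClient(client);
        client.transition(State.SUSPENDED, 1L);
        client.transition(State.CONNECTED, 3L); // reconnected under a new session
        System.out.println("logical session: " + resource.logicalSession());
    }
}
```

A suspend followed by a reconnect under the same session id leaves the logical session alone in this sketch; only a changed id bumps it.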
logback.xml file controls the logging. Others are normal Java logging and log4j
io.atomix.catalyst logging for example, or they set it to
AppendRequest to a follower, but the follower can't send an
AppendResponse, the follower will never start a new election, and the leader will never be able to commit any entries, thus preventing state from progressing altogether during the partition.
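To make that one-way partition concrete, here is a toy simulation. It is not Copycat code: the tick loop, `simulate`, and `electionTimeoutTicks` are all made up for illustration, and it assumes a leader that needs this single follower's acknowledgement to reach a majority. Requests always arrive, so the follower's election timer keeps getting reset, while the leader's commit index only moves if responses make it back.

```java
/** Toy model of a one-way partition; a sketch, not how Copycat implements Raft. */
final class OneWayPartitionSketch {

    /** Outcome of a simulated run. */
    static final class Result {
        final int commitIndex;
        final boolean electionStarted;

        Result(int commitIndex, boolean electionStarted) {
            this.commitIndex = commitIndex;
            this.electionStarted = electionStarted;
        }
    }

    /**
     * Simulates a leader and one follower whose ack is required for a majority.
     * AppendRequests always arrive; AppendResponses arrive only when
     * responsesDelivered is true. electionTimeoutTicks should be greater than 1,
     * since the leader sends one request per tick.
     */
    static Result simulate(int ticks, int electionTimeoutTicks, boolean responsesDelivered) {
        int leaderLastIndex = 0;
        int leaderCommitIndex = 0;
        int followerLastIndex = 0;
        int ticksSinceAppend = 0;
        boolean electionStarted = false;

        for (int t = 0; t < ticks; t++) {
            // The follower's election timer advances once per tick.
            ticksSinceAppend++;
            if (ticksSinceAppend >= electionTimeoutTicks) {
                electionStarted = true;
            }

            // The leader appends an entry and sends an AppendRequest, which gets through.
            leaderLastIndex++;
            followerLastIndex = leaderLastIndex;
            ticksSinceAppend = 0; // receiving the request resets the follower's timer

            // The leader can only advance its commit index if the response comes back.
            if (responsesDelivered) {
                leaderCommitIndex = Math.max(leaderCommitIndex, followerLastIndex);
            }
        }
        return new Result(leaderCommitIndex, electionStarted);
    }

    public static void main(String[] args) {
        Result partitioned = simulate(10, 3, false);
        System.out.println("partitioned: commitIndex=" + partitioned.commitIndex
                + ", electionStarted=" + partitioned.electionStarted);

        Result healthy = simulate(10, 3, true);
        System.out.println("healthy: commitIndex=" + healthy.commitIndex
                + ", electionStarted=" + healthy.electionStarted);
    }
}
```

In the partitioned run the commit index stays at 0 and no election ever starts, which is the stuck state described above.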
AppendRequest for long enough for a follower to try to get elected
LeaderAppender when snapshotting was added a while back. @electrical I should have a version with all the bugs I can reproduce fixed by your morning. I haven't been able to reproduce the leader changes you're seeing, but I did find and fix some bugs that were related to at least some of those leader changes. I think a day of cleaning up and testing that specific logic should get the leader back in working order as it was working great before. Jepsen tests have only ever caused leader changes during network partitions. This should put it back in that state and ready for RC.