These are chat archives for atomix/atomix

26th
Feb 2018
Jordan Halterman
@kuujo
Feb 26 2018 17:53
Working on the release now
Jordan Halterman
@kuujo
Feb 26 2018 18:06

@imperatorx Atomix currently doesn’t delete any primitive state machines. This is more a product of how we currently use it than anything. We have had some ideas about different ways to manage state machines, and one of them has been expiring them. I think one of the ways we could do that is by adding an option to automatically delete the state of a primitive once all its sessions are disconnected.

Any of these ideas would be very simple to implement in a single Raft partition. It gets a little more complicated with partitioned primitives, where we’d have to add an external timer with a two-phase commit protocol to ensure that deleting primitive state is atomic. TBH, I suppose we have to do that anyway.
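
As a rough illustration of the single-partition case described above, the sketch below deletes a primitive's state once its last session disconnects. It is a toy model in Python, not Atomix code: `PrimitiveRegistry`, `delete_when_unused`, and the method names are all hypothetical, and it ignores the cross-partition coordination (external timer, two-phase commit) that partitioned primitives would need.

```python
class PrimitiveRegistry:
    """Toy, single-partition registry of primitive state (hypothetical, not Atomix)."""

    def __init__(self, delete_when_unused=True):
        # Hypothetical option: drop state when the last session closes.
        self.delete_when_unused = delete_when_unused
        self._state = {}     # primitive name -> state dict
        self._sessions = {}  # primitive name -> set of open session ids

    def open_session(self, name, session_id):
        # Create the state lazily and record the session against it.
        state = self._state.setdefault(name, {})
        self._sessions.setdefault(name, set()).add(session_id)
        return state

    def close_session(self, name, session_id):
        sessions = self._sessions.get(name, set())
        sessions.discard(session_id)
        # Once no sessions remain, delete the primitive's state if configured to.
        if not sessions and self.delete_when_unused:
            self._state.pop(name, None)
            self._sessions.pop(name, None)

    def exists(self, name):
        return name in self._state
```

With two sessions open on the same primitive, closing the first leaves the state in place and closing the second removes it.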

Jordan Halterman
@kuujo
Feb 26 2018 18:11

@johnou there’s a decorator tests can use to inject a cluster:

@with_cluster(nodes=5)
def test_test(cluster):
    ...

That would automatically create and destroy the cluster before/after the test. I kept both to make it possible to define the test cluster externally for some tests, e.g. model checking.

Anyways, I think the test framework is pretty much done-ish. We can start writing tests at least and improve upon it from there. I’m going to document it and write some basic tests for examples.
Johno Crawford
@johnou
Feb 26 2018 19:57
@kuujo quite a few test failures on CI with the master branch
Jordan Halterman
@kuujo
Feb 26 2018 20:07
yeah I’m thinking...
Jordan Halterman
@kuujo
Feb 26 2018 22:21
we must have broken something somewhere along the way
can’t reproduce them in IntelliJ
Johno Crawford
@johnou
Feb 26 2018 22:23
perhaps I messed up the merge from 2.0 branch
one test is failing on my machine when running RaftTest when it's under heavy load, trying to debug that
[ERROR] Errors:
[ERROR] RaftTest.testThreeNodesEventsAfterFollowerKill:934->testEventsAfterFollowerKill:969->ConcurrentTestCase.await:105 » Timeout
my machine is failing on testFiveNodesEventsAfterFollowerKill
ci failed on that after the first batch of 2.0 fixes
Johno Crawford
@johnou
Feb 26 2018 22:58
of course even when generating artificial load and running the test i'm not able to reproduce it..
just sometimes fails when running the entire RaftTest
i got a trace log of a fail, want a look?
Jordan Halterman
@kuujo
Feb 26 2018 23:22
Actually I think that is a test that just used to be disabled but I reenabled it
Johno Crawford
@johnou
Feb 26 2018 23:23
  @Test
  @Ignore // Ignored due to lack of timeouts/retries in test protocol
  public void testThreeNodesEventsAfterFollowerKill() throws Throwable {
    testEventsAfterFollowerKill(3);
  }

  /**
   * Tests submitting sequential events.
   */
  @Test
  @Ignore // Ignored due to lack of timeouts/retries in test protocol
  public void testFiveNodesEventsAfterFollowerKill() throws Throwable {
    testEventsAfterFollowerKill(5);
  }
yep, although these are not the only ones failing on CI
just so happens that testThree* fails on my local every so often
[ERROR] testActiveJoinLate(io.atomix.protocols.raft.RaftTest) Time elapsed: 10.622 s <<< FAILURE!
java.lang.AssertionError: expected:<null> but was:<java.util.concurrent.CompletionException: io.atomix.primitive.PrimitiveException$Unavailable: Failed to reach consensus>
at io.atomix.protocols.raft.RaftTest.lambda$submit$1(RaftTest.java:208)
[ERROR] RaftTest.testThreeNodeSubmitQueryWithLinearizableConsistency:557->testSubmitQuery:618->ConcurrentTestCase.await:98 » Timeout
Jordan Halterman
@kuujo
Feb 26 2018 23:33
The Failed to reach consensus errors are probably unrelated to merge issues, but the test timeouts shouldn’t be happening
Johno Crawford
@johnou
Feb 26 2018 23:33
      session.invoke(EVENT, clientSerializer::encode, true).thenRun(this::resume);

      await(30000, 2);
why two resumes btw
Jordan Halterman
@kuujo
Feb 26 2018 23:34
Probably because there’s earlier code that receives the event and calls resume(), so it’s waiting for the event to be received and the command to return
Johno Crawford
@johnou
Feb 26 2018 23:34
ah yes in the listener
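
The two-resume wait discussed above can be illustrated with a small sketch: the test blocks until both the event listener and the command's completion callback have signalled. This is a Python analogue written for illustration, not the API behind `ConcurrentTestCase.await`. The `Waiter` class and the `await_resumes` name are assumptions, and the timeout is in seconds here rather than milliseconds.

```python
import threading


class Waiter:
    """Blocks a test thread until a given number of resume() calls arrive."""

    def __init__(self):
        self._cond = threading.Condition()
        self._resumes = 0

    def resume(self):
        # Called from callbacks, e.g. the event listener and the command future.
        with self._cond:
            self._resumes += 1
            self._cond.notify_all()

    def await_resumes(self, timeout_s, expected):
        # Wait until `expected` resumes have been counted, or time out.
        with self._cond:
            done = self._cond.wait_for(
                lambda: self._resumes >= expected, timeout=timeout_s)
            if not done:
                raise TimeoutError(
                    "expected %d resumes, got %d" % (expected, self._resumes))
            # Consume the resumes so the waiter can be reused.
            self._resumes -= expected
```

If only one of the two callbacks fires, the wait times out, which is how a lost event would surface as a test timeout.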