These are chat archives for atomix/atomix

5th
Jul 2016
Roman Pearah
@neverfox
Jul 05 2016 15:32
                                                             java.lang.Thread.run                       Thread.java:  745
                               java.util.concurrent.ThreadPoolExecutor$Worker.run           ThreadPoolExecutor.java:  617
                                java.util.concurrent.ThreadPoolExecutor.runWorker           ThreadPoolExecutor.java: 1142
         java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run  ScheduledThreadPoolExecutor.java:  293
  java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201  ScheduledThreadPoolExecutor.java:  180
                                              java.util.concurrent.FutureTask.run                   FutureTask.java:  266
                              java.util.concurrent.Executors$RunnableAdapter.call                    Executors.java:  511
                      io.atomix.catalyst.concurrent.Runnables.lambda$logFailure$2                    Runnables.java:   20
io.atomix.catalyst.transport.netty.NettyConnection.lambda$handleResponseSuccess$5              NettyConnection.java:  207
                                  java.util.concurrent.CompletableFuture.complete            CompletableFuture.java: 1962
                              java.util.concurrent.CompletableFuture.postComplete            CompletableFuture.java:  474
                   java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire            CompletableFuture.java:  736
                           java.util.concurrent.CompletableFuture.uniWhenComplete            CompletableFuture.java:  760
              io.atomix.copycat.client.util.ClientConnection.lambda$sendRequest$9             ClientConnection.java:  124
                    io.atomix.copycat.client.util.ClientConnection.handleResponse             ClientConnection.java:  146
                                  java.util.concurrent.CompletableFuture.complete            CompletableFuture.java: 1962
                              java.util.concurrent.CompletableFuture.postComplete            CompletableFuture.java:  474
                   java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire            CompletableFuture.java:  736
                           java.util.concurrent.CompletableFuture.uniWhenComplete            CompletableFuture.java:  760
      io.atomix.copycat.client.session.ClientSessionSubmitter$QueryAttempt.accept       ClientSessionSubmitter.java:  319
      io.atomix.copycat.client.session.ClientSessionSubmitter$QueryAttempt.accept       ClientSessionSubmitter.java:  345
                      io.atomix.copycat.error.CopycatError$Type$6.createException                 CopycatError.java:  133
io.atomix.copycat.error.UnknownSessionException: unknown member session
    type: #object[io.atomix.copycat.error.CopycatError$Type$6 0x5299cc20 "UNKNOWN_SESSION_ERROR"]
Any idea what would cause this? Long running process just died when this popped up.
Roman Pearah
@neverfox
Jul 05 2016 15:41
Hmm same error as @niquola, but I'm on rc9
Max Lord
@maxl0rd
Jul 05 2016 16:27
Hi @kuujo, I threw together a repo with an example that reproduces the pattern (and failure mode) I was attempting to use in an application; we discussed it briefly on Friday. Are you interested in taking a look? Most likely I am misusing the API in some way, but if you think it could be a bug, I can easily generate logs. Thanks in advance for any tips: https://github.com/wir35/atomix-messaging-example
Roman Pearah
@neverfox
Jul 05 2016 17:52
Actually, only my client was rc9
Jordan Halterman
@kuujo
Jul 05 2016 17:52
yep
Roman Pearah
@neverfox
Jul 05 2016 17:52
seems okay now with the replica at rc9 too, or else I got lucky
Jordan Halterman
@kuujo
Jul 05 2016 17:53
@niquola I’m actually going to be adding events for data structures very soon… I know they’re needed and are simple to implement anyways
Mikhail
@middlesphere
Jul 05 2016 18:53
@kuujo thanks! A couple more questions: 1) What is the max cluster size? (Can I have 101 nodes? It is not clear from the docs.) 2) Can Atomix pass Jepsen tests?
Jonathan Halterman
@jhalterman
Jul 05 2016 19:52
@middlesphere Re: 2, yes, Atomix has a Jepsen test suite: https://github.com/atomix/atomix-jepsen
...and it is shown to be linearizable under a variety of failure scenarios
@middlesphere Regarding cluster size, it can indeed be anything you want. We talk a little about cluster sizing here: http://atomix.io/atomix/docs/clustering/#determining-the-size-of-a-cluster and here http://atomix.io/atomix/docs/clustering/#replica-types
With a large cluster, you really only need a few nodes participating in the processing of write operations. These are your active replicas http://atomix.io/atomix/docs/clustering/#active-replicas
Your other nodes can be made passive or reserve replicas. They will still be accessible for read operations and, in the case of passive replicas, will maintain a copy of your resource state, but they will not participate in write operations (which would impact write performance).
Jonathan Halterman
@jhalterman
Jul 05 2016 19:59
@middlesphere Basically, you just want to be thoughtful about the role that the various nodes in your cluster are playing. How many nodes are needed to process write requests, how many nodes do you want to serve as passive replicas, etc. Having 101 active replicas is possible, but doesn't make much sense. Having 5 active replicas, maybe 5 passive nodes and the rest reserve, something like that makes more sense.
Since Atomix data is not partitioned, each active or passive node will contain an entire replica of all resource state in your cluster. So the benefit of numerous nodes isn't scalability, it's fault tolerance.
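To put rough numbers on the sizing advice above, here is a minimal sketch in plain Java. It assumes a majority-quorum (Raft-style) commit rule among the active replicas, which the messages above do not spell out. The class `ClusterSizing` and its methods are hypothetical helpers for illustration only and are not part of the Atomix API.

```java
// Illustrative arithmetic only; none of this is Atomix API.
// Assumption: a write commits once a majority of ACTIVE replicas acknowledge it.
final class ClusterSizing {

    /** Size of a majority among the given number of active replicas. */
    static int quorum(int activeReplicas) {
        if (activeReplicas < 1) {
            throw new IllegalArgumentException("need at least one active replica");
        }
        return activeReplicas / 2 + 1;
    }

    /** How many active replicas can fail while a majority is still reachable. */
    static int toleratedFailures(int activeReplicas) {
        return activeReplicas - quorum(activeReplicas);
    }

    /** Nodes left over as reserve after the active and passive roles are assigned. */
    static int reserveCount(int totalNodes, int active, int passive) {
        int reserve = totalNodes - active - passive;
        if (active < 1 || passive < 0 || reserve < 0) {
            throw new IllegalArgumentException("invalid role split");
        }
        return reserve;
    }

    public static void main(String[] args) {
        int total = 101;

        // The split suggested in the chat: 5 active, 5 passive, the rest reserve.
        int active = 5;
        int passive = 5;
        System.out.printf("active=%d quorum=%d toleratedFailures=%d passive=%d reserve=%d%n",
                active, quorum(active), toleratedFailures(active),
                passive, reserveCount(total, active, passive));

        // All 101 nodes active: every write would wait on a much larger majority.
        System.out.printf("active=%d quorum=%d toleratedFailures=%d%n",
                total, quorum(total), toleratedFailures(total));
    }
}
```

Under that assumption, 5 active replicas need 3 acknowledgements per write and survive 2 failures, while 101 active replicas would need 51 acknowledgements per write. That larger majority is the write cost that keeping most nodes passive or reserve avoids.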
Jordan Halterman
@kuujo
Jul 05 2016 20:12
@maxl0rd it looks totally fine to me. Has to be a bug in the state machine. The code that manages messaging is a little complex because of the intersection of persistent/non-persistent members and request-reply, publish-subscribe, ack/fail, etc. I'm sure I can reproduce it in a test on my side. My work on DistributedGroup messaging actually got delayed over the weekend, but I have a new sprint starting tomorrow that I'm trying to get it into.