These are chat archives for atomix/atomix

28th
Apr 2018
Jordan Halterman
@kuujo
Apr 28 2018 00:05
Rebalancing Raft partitions and resizing partitions are not implemented. Rebalancing partitions is easy to implement. Resizing probably not so much, and there may not be much benefit to resizing partitions for consensus anyway. Rebalancing them is where the money’s at.
Johno Crawford
@johnou
Apr 28 2018 00:07
so given a partition with 3 backups and 10 nodes, if 1 of those nodes is shut down, will it pick another backup?
Jordan Halterman
@kuujo
Apr 28 2018 00:08
yeah
I just removed it to do the partition group discovery stuff
now it can be added back in though
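A minimal sketch of the behaviour asked about above, where a partition picks a replacement backup after one of its nodes shuts down. This is not the Atomix implementation: `BackupSelection`, `replicasFor`, and the "walk the sorted member list" placement rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Atomix: chooses a primary plus N backups
// for a partition by walking the sorted list of live members.
class BackupSelection {

    // Returns the primary followed by up to `backups` distinct backup members.
    static List<String> replicasFor(int partitionId, int backups, TreeSet<String> liveMembers) {
        List<String> sorted = new ArrayList<>(liveMembers);
        List<String> replicas = new ArrayList<>();
        if (sorted.isEmpty()) {
            return replicas;
        }
        // Never select more replicas than there are live members.
        int count = Math.min(backups + 1, sorted.size());
        int start = Math.floorMod(partitionId, sorted.size());
        for (int i = 0; i < count; i++) {
            replicas.add(sorted.get((start + i) % sorted.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        TreeSet<String> members = new TreeSet<>();
        for (int i = 0; i < 10; i++) {
            members.add("node-" + i);
        }
        System.out.println(replicasFor(1, 3, members));

        // Simulate one replica shutting down: recomputing picks a replacement.
        members.remove("node-2");
        System.out.println(replicasFor(1, 3, members));
    }
}
```

Because the start index depends on the member count, a membership change in this sketch can also move replicas of unrelated partitions. A real implementation would likely want a more stable placement scheme.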
Jordan Halterman
@kuujo
Apr 28 2018 00:14
we’d have to build some sort of partition group membership API so the partition group knows which members have joined the group
that’s why I didn’t do it yet
Johno Crawford
@johnou
Apr 28 2018 00:14
yeah makes sense
so I diff'd ONOS netty messaging and atomix
found that ONOS also invokes localClientConnection.timeoutCallbacks() inside timeoutAllCallbacks
is that the leak you mentioned?
Jordan Halterman
@kuujo
Apr 28 2018 00:18
no it’s in the connection pool
hmm maybe I did already copy it over actually
oh yeah I did
Johno Crawford
@johnou
Apr 28 2018 00:19
do we want that timeout callbacks method call too?
or was that omitted on purpose?
Jordan Halterman
@kuujo
Apr 28 2018 00:20
we probably want it
Johno Crawford
@johnou
Apr 28 2018 00:21
something must be seriously wrong if a local op is failing though
maybe it was removed because of the phi timeout false positives
Jordan Halterman
@kuujo
Apr 28 2018 00:22
well, the problem is when a local op proxies a request to a remote server
Johno Crawford
@johnou
Apr 28 2018 00:23
oh right
Jordan Halterman
@kuujo
Apr 28 2018 00:23
so it’s not so much that a local op hangs, it’s that a local op makes a remote call that’s expensive
Timeout local messages in NettyMessagingManager to avoid hanging when receivers are blocked on external calls.
need to remember what other stuff I fixed in ONOS from the field trials
not much else that applies to Atomix itself
already got the lock changes
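The idea of timing out local messages can be sketched roughly as follows. This is an illustration only, not the NettyMessagingManager code from ONOS or Atomix: the class `LocalCallbacks`, its fields, and the explicit `nowMillis` parameter are assumptions, and only the method name `timeoutCallbacks` echoes the discussion above.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: tracks pending reply futures for locally delivered
// messages and fails the ones that have waited longer than a fixed timeout.
class LocalCallbacks {

    private static final class Callback {
        final long startedAtMillis;
        final CompletableFuture<byte[]> future = new CompletableFuture<>();

        Callback(long startedAtMillis) {
            this.startedAtMillis = startedAtMillis;
        }
    }

    private final Map<Long, Callback> pending = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    LocalCallbacks(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Registers a pending reply for the given message id.
    CompletableFuture<byte[]> register(long messageId, long nowMillis) {
        Callback callback = new Callback(nowMillis);
        pending.put(messageId, callback);
        return callback.future;
    }

    // Completes a pending reply; does nothing if it already timed out.
    void complete(long messageId, byte[] reply) {
        Callback callback = pending.remove(messageId);
        if (callback != null) {
            callback.future.complete(reply);
        }
    }

    // Fails every callback older than the timeout and returns how many expired.
    // Intended to be called periodically, e.g. from a scheduled task.
    int timeoutCallbacks(long nowMillis) {
        int expired = 0;
        Iterator<Map.Entry<Long, Callback>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Callback callback = it.next().getValue();
            if (nowMillis - callback.startedAtMillis > timeoutMillis) {
                it.remove();
                callback.future.completeExceptionally(
                        new TimeoutException("Local request timed out"));
                expired++;
            }
        }
        return expired;
    }
}
```

With this in place, a local handler that blocks on an expensive remote call no longer leaves its caller waiting forever: the caller's future fails with a `TimeoutException` once the sweep runs.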
Jordan Halterman
@kuujo
Apr 28 2018 00:32
I think I can actually reuse a lot of the DefaultPartitionService code to make a new PartitionGroupMembershipService that replicates the members in a partition group to make balancing Raft partitions possible
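A rough sketch of what a partition group membership tracker could look like. The real PartitionGroupMembershipService is not shown in this chat, so everything here is an assumption: the class `GroupMembership`, its methods, and the listener shape are hypothetical, and the state is kept in local memory only rather than replicated between nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BiConsumer;

// Hypothetical sketch: records which members have joined each partition
// group and notifies listeners whenever a group's membership changes.
class GroupMembership {

    private final Map<String, TreeSet<String>> groups = new HashMap<>();
    private final List<BiConsumer<String, Set<String>>> listeners = new ArrayList<>();

    // Listeners receive the group name and a snapshot of its members.
    synchronized void addListener(BiConsumer<String, Set<String>> listener) {
        listeners.add(listener);
    }

    synchronized void join(String group, String member) {
        // Notify only when the member was not already present.
        if (groups.computeIfAbsent(group, g -> new TreeSet<>()).add(member)) {
            notifyListeners(group);
        }
    }

    synchronized void leave(String group, String member) {
        TreeSet<String> members = groups.get(group);
        if (members != null && members.remove(member)) {
            notifyListeners(group);
        }
    }

    // Returns an immutable snapshot of the group's current members.
    synchronized Set<String> members(String group) {
        TreeSet<String> members = groups.get(group);
        if (members == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(new TreeSet<>(members));
    }

    private void notifyListeners(String group) {
        Set<String> snapshot = members(group);
        for (BiConsumer<String, Set<String>> listener : listeners) {
            listener.accept(group, snapshot);
        }
    }
}
```

A rebalancer for Raft partitions could register a listener here and recompute partition assignments each time the member set of its group changes.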
Jordan Halterman
@kuujo
Apr 28 2018 01:08
easy :-)
Johno Crawford
@johnou
Apr 28 2018 09:50
that sounds like a win