These are chat archives for atomix/atomix

30th Apr 2017
Jordan Halterman
@kuujo
Apr 30 2017 07:08
Wow that's really cool thanks!
I'll take a look at it more closely, but I actually thought about this topic a lot more on Friday and came to an interesting realization...
Jordan Halterman
@kuujo
Apr 30 2017 07:33

So, obviously I implemented the new thread model for ONOS. To recap, that thread model is essentially: each partition of each primitive has a single thread, which guarantees order for responses/events coming from a single state machine.
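
For reference, here's a minimal sketch of that per-primitive, per-partition thread model, assuming one single-threaded executor per (primitive, partition) pair; the class and method names are illustrative, not the actual ONOS code:

    // Illustrative sketch: one single-threaded executor per (primitive name, partition id),
    // so all responses/events for that pair run on the same thread and keep their order.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executor;
    import java.util.concurrent.Executors;

    final class PrimitiveThreadModel {
      private final Map<String, Executor> executors = new ConcurrentHashMap<>();

      Executor executorFor(String primitiveName, int partitionId) {
        return executors.computeIfAbsent(primitiveName + "-" + partitionId,
            key -> Executors.newSingleThreadExecutor());
      }
    }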

When I was running some tests on one of our clusters on Friday, I realized a lot of the failures we see from the Copycat client end up cascading to other parts of the system. The reason for that is that a single Copycat client is still shared by all primitives for a given partition. That means a single Copycat client does request/response/event sequencing, connections, retries, etc. for hundreds or thousands of different primitives. The problem is that if a single request is lost, it affects the performance of all other primitives for that partition. If an event is received out of sequence, events for all primitives are blocked, and responses for all primitives can be blocked as well, since the client guarantees the user sees responses and events in the order in which they occurred in the state machines. In ONOS, that means a failure in one application can affect the performance of other applications. Also, primitives that use SEQUENTIAL reads can slow down responses from leaders, since the return of the leader's index forces a SEQUENTIAL read to wait for that entry before returning, and that can mean waiting for a heartbeat from the leader.
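
To illustrate the head-of-line blocking, here's a rough sketch, with hypothetical names, of how a single shared sequencer stalls everything behind one missing response/event, regardless of which primitive it belongs to:

    import java.util.NavigableMap;
    import java.util.TreeMap;

    // Hypothetical sketch: every response/event for every primitive in the partition
    // funnels through one sequence counter, so a single gap stalls all of them.
    final class SharedSequencer {
      private long nextSequence = 1;
      private final NavigableMap<Long, Runnable> pending = new TreeMap<>();

      // Called for every response/event coming back from the partition's state machines.
      synchronized void complete(long sequence, Runnable callback) {
        pending.put(sequence, callback);
        // Drain strictly in order: if entry `nextSequence` was lost or delayed, nothing
        // after it completes, no matter which primitive it belongs to.
        while (!pending.isEmpty() && pending.firstKey() == nextSequence) {
          pending.pollFirstEntry().getValue().run();
          nextSequence++;
        }
      }
    }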

Anyways, what I realized was that the primitive threading model added to ONOS last week needs to be extended into the Copycat client. Essentially, what we need is for each partition of each primitive to have its own logical session that does request/response/event sequencing independently of all other partitions of all other primitives. This can be done by simply piggybacking multiple sessions on a single keep-alive request, more or less multiplexing logical sessions on a single physical session (it would be impractical/expensive to create actual sessions for each partition/primitive). That would have two benefits. First, because sequencing for one primitive occurs independently of sequencing for all other primitives, an out-of-sequence response/event in one primitive will not affect the performance of other primitives. So, the cascading timeouts we sometimes see in ONOS will be less likely to occur since primitives are effectively encapsulated from one another, and we still maintain the exact same guarantees since we've already relaxed the thread model in ONOS and order can only be guaranteed within a single thread anyways. Second, the read consistency guarantees of one primitive will no longer affect the performance of other primitives, a coupling that has all but eliminated their usefulness in ONOS. It will also increase concurrency in the client all the way down to Netty.
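
Roughly, the keep-alive piggybacking could look something like the following sketch; the class and field names are assumptions, not Copycat's actual protocol types:

    import java.util.Collection;

    // Sketch: one physical keep-alive carries the state of many logical sessions.
    final class KeepAliveRequest {
      final long physicalSessionId;                          // the one real session to the cluster
      final Collection<LogicalSessionState> logicalSessions; // one entry per primitive per partition

      KeepAliveRequest(long physicalSessionId, Collection<LogicalSessionState> logicalSessions) {
        this.physicalSessionId = physicalSessionId;
        this.logicalSessions = logicalSessions;
      }
    }

    final class LogicalSessionState {
      final long logicalSessionId; // identifies one (primitive, partition) pair
      final long commandSequence;  // highest command sequence completed by this logical session
      final long eventIndex;       // highest event index received by this logical session

      LogicalSessionState(long logicalSessionId, long commandSequence, long eventIndex) {
        this.logicalSessionId = logicalSessionId;
        this.commandSequence = commandSequence;
        this.eventIndex = eventIndex;
      }
    }

The point is that one round trip keeps every logical session alive, so we avoid the cost of real per-primitive sessions while still sequencing each one independently.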

So, I'm planning on proposing the addition of a ThreadAwareCopycatClient that essentially associates ThreadContexts with logical sessions in the client to encapsulate ordering guarantees within the protocol across threads. In ONOS, we'll use the ThreadAwareCopycatClient and simply execute calls to the client on a separate ThreadContext for each partition of each primitive, encapsulating all the primitives' sessions from each other. That breaks ordering guarantees across primitives, but we already broke them in the new thread model. I expect it will make ONOS's primitives and the applications that use them much more resilient to failures in the protocol, isolate failures when they do happen, and provide a pretty substantial improvement in performance.
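
A sketch of what the proposed ThreadAwareCopycatClient might look like; this is not an existing Copycat API, and the interface and method names here are illustrative only:

    import java.util.concurrent.CompletableFuture;

    // Proposal sketch: logical sessions keyed by primitive and partition, each bound to
    // its own thread, so sequencing failures stay contained to one primitive.
    interface ThreadAwareCopycatClient {

      // Returns a logical session bound to its own ThreadContext, keyed by primitive name
      // and partition id. Request/response/event sequencing happens per logical session.
      LogicalSession sessionFor(String primitiveName, int partitionId);

      interface LogicalSession {
        // Submit an operation; the returned future completes on this session's thread.
        <T> CompletableFuture<T> submit(Object operation);
      }
    }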