These are chat archives for atomix/atomix
So, when you create a new Raft primitive you’re opening a new logical session. The Raft implementation is optimized for many primitives being active at the same time. Essentially, each node has one Raft client per Raft partition, and all primitives share those clients, so the overhead of many primitives is fairly low. Each primitive has a distinct Raft session, but the client piggybacks the keep-alives for all sessions on a single request. Basically, what you’re adding when you open a new session is up to 16 bytes on existing keep-alives. So, if you keep creating primitives without closing them you will have a very slow memory leak and slowly overload the network, but it will take a long time.
But all this means that the cluster can handle a large number of primitives. ONOS probably has hundreds of active primitives, and in large clusters thousands of sessions, that it never closes. You’ll still only have a single keep-alive per partition per node, with each keep-alive containing metadata (Raft log/event indexes) for each active session.
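To make the piggybacking concrete, here is a rough toy model in Python. It is not Atomix code: `PartitionClient`, `Session`, `build_keep_alive` and `keep_alive_overhead` are made-up names for illustration, and the flat 16 bytes per session is just the upper bound mentioned above, not a measured wire size.

```python
from dataclasses import dataclass, field

# Assumption: use the "up to 16 bytes" figure from the chat as a flat per-session cost.
BYTES_PER_SESSION = 16


@dataclass
class Session:
    """Hypothetical per-primitive session state carried in a keep-alive."""
    session_id: int
    log_index: int = 0    # last Raft log index seen by this session
    event_index: int = 0  # last event index received by this session


@dataclass
class PartitionClient:
    """Toy stand-in for the single Raft client a node keeps per partition."""
    partition_id: int
    sessions: dict = field(default_factory=dict)
    _next_id: int = 1

    def open_session(self) -> Session:
        # Each new primitive registers its own logical session on the shared client.
        session = Session(self._next_id)
        self.sessions[session.session_id] = session
        self._next_id += 1
        return session

    def close_session(self, session_id: int) -> None:
        # Closing a session removes its metadata from future keep-alives.
        self.sessions.pop(session_id, None)

    def build_keep_alive(self) -> dict:
        # One request per partition, carrying metadata for every open session.
        return {
            "partition": self.partition_id,
            "sessions": [
                (s.session_id, s.log_index, s.event_index)
                for s in self.sessions.values()
            ],
        }

    def keep_alive_overhead(self) -> int:
        # Extra bytes attributable to open sessions, under the assumption above.
        return BYTES_PER_SESSION * len(self.sessions)
```

Opening three sessions on one `PartitionClient` still produces a single keep-alive from `build_keep_alive()`, just with three entries in it. Sessions that are never closed stay in that list forever, which is the slow leak described above.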
To answer your other questions, if a node goes down the primitives’ sessions will be expired and the state machines will treat them the same way as if they were closed. For example, if you acquire a lock and then the node crashes, the lock primitive’s session will expire and the lock will be released.
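A minimal sketch of that behaviour, again as a toy model rather than the real lock state machine: `LockStateMachine` and its methods are hypothetical names, and handing the lock to the next waiter in FIFO order is an assumption made for the example.

```python
from collections import deque


class LockStateMachine:
    """Toy lock state machine keyed by session ID (illustrative only)."""

    def __init__(self):
        self.holder = None      # session ID currently holding the lock
        self.waiters = deque()  # session IDs waiting for the lock, FIFO

    def acquire(self, session_id) -> bool:
        # Grant immediately if the lock is free or already held by this session.
        if self.holder is None or self.holder == session_id:
            self.holder = session_id
            return True
        # Otherwise queue the session (once) and report that it must wait.
        if session_id not in self.waiters:
            self.waiters.append(session_id)
        return False

    def _remove_session(self, session_id) -> None:
        # Drop the session from the wait queue if it is there.
        if session_id in self.waiters:
            self.waiters.remove(session_id)
        # If it held the lock, release it and hand it to the next waiter, if any.
        if self.holder == session_id:
            self.holder = self.waiters.popleft() if self.waiters else None

    def close(self, session_id) -> None:
        # Session closed cleanly by the client.
        self._remove_session(session_id)

    def expire(self, session_id) -> None:
        # Session expired (e.g. the node crashed): handled exactly like close.
        self._remove_session(session_id)
```

For example, if session 1 holds the lock and session 2 is waiting, `expire(1)` leaves session 2 as the holder, the same outcome as if session 1 had called `close`.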
There are some APIs on Atomix that will show registered primitives, but not active primitive sessions IIRC. Seems like that could be useful, though, especially as a metric that can be read via the REST API.
In other news, sorry I haven’t gotten to PRs yet today. I have been finishing the SWIM spec, but it’s pretty good now:
Probably will finish up the implementation tonight