These are chat archives for atomix/atomix

5th
Oct 2018
Jordan Halterman
@kuujo
Oct 05 2018 03:34

I think this work is probably going to have to be part of the 3.1 milestone, which should be targeted for December. I’m going to add a few issues for it.

Basically, the way NodeDiscoveryProviders are currently implemented is redundant. Both the discovery providers and the membership service use pretty similar protocols. Both detect failures. I’m getting rid of that redundancy first. The membership service and the membership service alone should decide when a node is no longer a member of the cluster. The providers will become a lot simpler - can just provide a static list of nodes - and this will pave the way for a DnsDiscoveryProvider.
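To make the split of responsibilities concrete, here is a minimal sketch, not the actual Atomix API: `MembershipSketch` and its methods are hypothetical names. It assumes the provider only hands over a static seed list, and that a single failed probe stands in for whatever failure detector the membership protocol really uses.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Atomix.
final class MembershipSketch {
    private final Set<String> members = new LinkedHashSet<>();

    // The seed list is all the discovery provider contributes.
    MembershipSketch(List<String> seedsFromProvider) {
        members.addAll(seedsFromProvider);
    }

    // Only the membership service adds or removes members.
    // Assumption: one failed probe is enough to drop a node (a real detector is more careful).
    void onProbeResult(String node, boolean reachable) {
        if (reachable) {
            members.add(node);
        } else {
            members.remove(node);
        }
    }

    // Returns a copy so callers cannot mutate the membership view.
    Set<String> members() {
        return new LinkedHashSet<>(members);
    }
}
```

The point of the design is that the provider never reports a node as gone; removal happens in exactly one place.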

The other component of the 3.1 cluster membership work would be making the protocol pluggable and adding the SWIM protocol. I should have the initial implementation of that protocol done now, and it will mean the cluster can be configured to use either the gossip/phi accrual membership protocol or the more scalable SWIM protocol.

cluster {
  discovery {
    type: dns
    service: some.dns.service
  }

  protocol {
    type: swim
    probeInterval: 100ms
    failureProbes: 3
  }
}
The hard part is maintaining backwards/forwards compatibility for the membership protocols. That has to be done carefully so we can keep rolling upgrades working and then remove that compatibility code in 3.2.
The other changes planned for 3.1 are a new messaging protocol that includes a handshake for protocol version negotiation, and the new Raft log, which has been done for a while.
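As a rough illustration of what a version-negotiating handshake could decide, here is a sketch under stated assumptions: `VersionHandshake` and `negotiate` are hypothetical names, versions are assumed to be plain integers, and the rule assumed is "pick the highest version both sides support". The real messaging protocol may work differently.

```java
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not the Atomix messaging protocol.
final class VersionHandshake {
    private VersionHandshake() {}

    // Returns the highest protocol version supported by both peers,
    // or empty if they share none (the connection would then be refused).
    static Optional<Integer> negotiate(Set<Integer> local, Set<Integer> remote) {
        TreeSet<Integer> common = new TreeSet<>(local);
        common.retainAll(remote);
        return common.isEmpty() ? Optional.empty() : Optional.of(common.last());
    }

    public static void main(String[] args) {
        // An old node (versions 1, 2) talking to a newer node (versions 2, 3).
        System.out.println(negotiate(Set.of(1, 2), Set.of(2, 3)));
    }
}
```

Something along these lines is what lets old and new nodes talk during a rolling upgrade: each pair of peers settles on a version both understand.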
Luca Burgazzoli
@lburgazzoli
Oct 05 2018 06:17
yeah, the NodeDiscoveryProvider I wrote was a copy and paste, with some modifications, of another discovery implementation I found. I wasn't sure I had done it right, but it was more of a proof of concept
looking forward to having something similar natively supported
Luca Burgazzoli
@lburgazzoli
Oct 05 2018 07:34
@kuujo but the provider should still have a way to refresh the list of nodes, correct?
Mike Hearn
@mikehearn
Oct 05 2018 12:14
thanks kuujo
Jordan Halterman
@kuujo
Oct 05 2018 19:10
@lburgazzoli yeah the provider interface will stay the same, it just won’t have to refresh the list of nodes. For example, the bootstrap provider can just return a list of Nodes, and the GroupMembershipProtocols (phi and swim) will be capable of propagating new nodes without that having to be done in the provider. This allows both a push and pull model.
The other thing is we should probably allow multiple providers, e.g. so multicast can be used in conjunction with the bootstrap provider
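A sketch of how several providers might be combined, assuming each one just returns a set of node addresses. `DiscoverySource`, `StaticSource` and `CompositeSource` are hypothetical names for illustration, not the NodeDiscoveryProvider interface.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these types exist in Atomix.
interface DiscoverySource {
    Set<String> nodes();
}

// Stands in for a bootstrap-style provider: a fixed list of addresses.
final class StaticSource implements DiscoverySource {
    private final Set<String> nodes;

    StaticSource(String... addresses) {
        this.nodes = new LinkedHashSet<>(Arrays.asList(addresses));
    }

    @Override
    public Set<String> nodes() {
        return nodes;
    }
}

// Merges the results of several sources and drops duplicate addresses.
final class CompositeSource implements DiscoverySource {
    private final List<DiscoverySource> sources;

    CompositeSource(DiscoverySource... sources) {
        this.sources = Arrays.asList(sources);
    }

    @Override
    public Set<String> nodes() {
        Set<String> all = new LinkedHashSet<>();
        for (DiscoverySource source : sources) {
            all.addAll(source.nodes());
        }
        return all;
    }
}
```

With this shape, a multicast-backed source and a bootstrap list could sit side by side, and the membership protocol would only ever see the merged set.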
Going to add some issues and a 3.1 milestone for this right now