These are chat archives for atomix/atomix

30th
Aug 2018
william.z
@zwillim
Aug 30 2018 03:02
If I want my cluster to detect new nodes, it seems I should use MulticastDiscoveryProvider. In my opinion, a new node should send a msg TO somewhere, and one or more existing nodes should ACCEPT the msg. I'm confused because I cannot find any such server-client description in the documentation or in the MulticastDiscoveryProvider class.
Is there a demo or something about this?
Johno Crawford
@johnou
Aug 30 2018 06:23
@kuujo what API
Jordan Halterman
@kuujo
Aug 30 2018 06:42
@zwillim I’m not sure what you mean. New nodes do broadcast a message to notify other nodes of their existence. Both discovery providers work that way: the multicast provider uses multicast to broadcast a node’s information to any listeners, and the bootstrap provider sends a TCP message to the bootstrap nodes to tell them about itself.
william.z
@zwillim
Aug 30 2018 07:20
@kuujo so, how does a new node know where the listeners are? and do all nodes listen for such msgs by default, or do I need to configure that somewhere?
Jordan Halterman
@kuujo
Aug 30 2018 07:20
In the case of multicast that’s handled by the multicast protocol. In the case of BootstrapDiscoveryProvider you’re providing a list of nodes to notify
william.z
@zwillim
Aug 30 2018 07:21
oh, does that mean I need to use BootstrapDiscoveryProvider at the same time?
Jordan Halterman
@kuujo
Aug 30 2018 07:21
currently you can only configure one or the other, although I think there’s probably an argument to be made for allowing both to be used at the same time
So if you’re using MulticastDiscoveryProvider then the multicast protocol is used to notify other nodes of the new node’s existence. Once those nodes learn of a new node, they connect back to the new node over TCP and tell it about the rest of the cluster.
If you’re using BootstrapDiscoveryProvider, the new node connects to the bootstrap nodes and sends a message telling them about it, then those nodes send a message back telling the new node about all the other nodes.
multicast is nice because it doesn’t require nodes to know about each other, but many networks don’t even have multicast enabled
william.z
@zwillim
Aug 30 2018 07:29

I'm still confused... because I think, as a new node, I need a place where I can configure whom to register myself with
for example, I configure Server1 as the broadcast node; the new node would register itself with Server1, and Server1 would introduce the new node to the cluster.

let's say like this:

if I have a running cluster: 192.168.10.1:1111, 192.168.10.2:1111
and a new node 192.168.10.3:1111 wants to join the cluster

what should I do on the existing nodes and the new node?

or maybe you mean that MulticastDiscoveryProvider uses the network's broadcast address (like 192.168.10.255) to communicate?
Jordan Halterman
@kuujo
Aug 30 2018 07:35
Yes. The configuration you’re doing is the discovery provider. If all the nodes are configured to use MulticastDiscoveryProvider then they will discover each other over multicast. You’re configuring a multicast group instead of specific node information, and the multicast protocol is responsible for nodes finding each other. A joining node broadcasts its information to a multicast group rather than any specific node, and the other nodes listen for join messages sent to that multicast group.
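For example, a multicast-discovery configuration might look roughly like this (a sketch against the Atomix 3 Java builder API; the member ID, node address, and multicast group here are placeholders, and exact builder method names may differ slightly between versions):

```java
import io.atomix.cluster.discovery.MulticastDiscoveryProvider;
import io.atomix.core.Atomix;
import io.atomix.utils.net.Address;

// Every node uses the same multicast group; no peer addresses are listed.
// The multicast protocol handles the "who do I announce myself to" question.
Atomix atomix = Atomix.builder()
    .withMemberId("member1")
    .withAddress("192.168.10.1:1111")                        // this node's own TCP address
    .withMulticastEnabled()                                  // enable the multicast transport
    .withMulticastAddress(Address.from("230.0.0.1:54321"))   // the shared multicast group
    .withMembershipProvider(MulticastDiscoveryProvider.builder().build())
    .build();
```

Note that nothing here names any other node: the joining node broadcasts to the group, and listeners connect back over TCP.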
But if you use the BootstrapDiscoveryProvider then you are configuring specific nodes to initially join when you construct the provider via BootstrapDiscoveryProvider.builder().withNodes(…).build()
So the new node in that case configures a BootstrapDiscoveryProvider with the Addresses of the nodes that are already running, i.e. 192.168.10.1:1111 and 192.168.10.2:1111
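A sketch of that bootstrap configuration for the joining node, using the addresses above (again assuming the Atomix 3 Java builder API; the member IDs are placeholders):

```java
import io.atomix.cluster.Node;
import io.atomix.cluster.discovery.BootstrapDiscoveryProvider;
import io.atomix.core.Atomix;

// The new node (192.168.10.3) explicitly lists the already-running nodes.
// It connects to them over TCP, announces itself, and learns the rest of
// the cluster from their replies.
Atomix atomix = Atomix.builder()
    .withMemberId("member3")
    .withAddress("192.168.10.3:1111")
    .withMembershipProvider(BootstrapDiscoveryProvider.builder()
        .withNodes(
            Node.builder().withAddress("192.168.10.1:1111").build(),
            Node.builder().withAddress("192.168.10.2:1111").build())
        .build())
    .build();
```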
This is the multicast implementation used to send messages to a multicast group
And this is the TCP implementation used to send messages directly to specific nodes
william.z
@zwillim
Aug 30 2018 07:40
wow, I got it. That's great! So it's not only TCP that's used in this protocol
Jordan Halterman
@kuujo
Aug 30 2018 07:40
yep
in the future we could also do something like IP ranges for a purpose similar to multicast
william.z
@zwillim
Aug 30 2018 07:43
great feature! It's quite convenient
and in the MulticastDiscoveryProvider, what does withMulticastAddress("230.0.0.1:54321") mean? does it mean 'I am 230.0.0.1:54321'?
but then it should be called withAddress
william.z
@zwillim
Aug 30 2018 08:02
Oh, that is just the multicast address..
william.z
@zwillim
Aug 30 2018 11:14
When I'm using MulticastDiscoveryProvider, I still need to specify members to use RaftPartitionGroup. Is there a way to automatically discover the members? Furthermore, is there a good way to build a cluster where nodes are added and removed frequently?
Jordan Halterman
@kuujo
Aug 30 2018 19:16

@zwillim these are good questions.

The Raft partition group requires member IDs to be provided because it’s a requirement of consensus. In order to safely bootstrap a consensus (Raft) cluster, the members of the cluster must be static and known by all nodes. So, this can unfortunately limit the usefulness of dynamic clustering. But here’s why this is so...

One of the primary goals of consensus is to ensure a single view of the system’s state especially during network partitions. No matter what node or network failure occurs, Raft will maintain a single copy of the system’s state. Once an operation has been completed on a Raft partition, it’s guaranteed to be successful and persisted forever, no matter what happens to the cluster.

In order to provide this guarantee and avoid split brain - when two portions of the cluster diverge from one another - all the Raft partitions must know which other nodes are participating in the cluster. If they don’t know which other nodes participate in the cluster, they have no way to count votes or determine who won an election.

For example, imagine you start a cluster of three Raft nodes and there’s a network partition. Two nodes are on one side of the partition and one is on the other. The single node sees only itself in the cluster, so it starts and wins an election by voting for itself. The other two nodes see each other and start an election, and one of them wins. You now have two leaders in the first term, i.e. split brain. Writes will succeed on either side of the partition, and when the partition heals the nodes will find each other and ultimately elect a single leader. Even after a single leader is elected, you’ll see different views of the system’s state depending on which node you read from, and eventually one side of the partition will be overwritten by the other.
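The vote counting in that scenario can be sketched as a toy calculation (plain Java, not Atomix code) showing why a fixed, known cluster size prevents the double election:

```java
// Toy illustration of Raft quorum arithmetic with static vs. dynamic membership.
public class QuorumDemo {

    // Majority quorum for a known, static cluster size.
    static int majority(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    public static void main(String[] args) {
        // Static 3-node cluster partitioned into {A, B} and {C}.
        // Every node counts votes against the full, known size of 3:
        System.out.println(2 >= majority(3)); // true  -> {A, B} can elect a leader
        System.out.println(1 >= majority(3)); // false -> {C} cannot

        // If each side instead counted votes against only the members it can
        // currently see (dynamic membership), both sides would "win":
        System.out.println(2 >= majority(2)); // true -> a leader on one side
        System.out.println(1 >= majority(1)); // true -> a leader on the other: split brain
    }
}
```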

So, this is what we can get when Raft nodes try to dynamically discover each other. But actually we can get the same level of consistency using much more efficient algorithms (e.g. the primary-backup algorithm). So, if dynamic clustering for the entire cluster is truly desirable, I’d suggest using the primary-backup algorithm. It will be faster and provide the same guarantees a dynamic Raft cluster would.
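A sketch of that primary-backup setup, assuming the Atomix 3 builder API (the group names and partition counts here are arbitrary examples):

```java
import io.atomix.core.Atomix;
import io.atomix.protocols.backup.partition.PrimaryBackupPartitionGroup;

// Primary-backup partition groups don't require a static member list,
// so the whole cluster can be discovered dynamically (e.g. via multicast)
// and can grow and shrink freely.
Atomix atomix = Atomix.builder()
    .withMemberId("member1")
    .withAddress("192.168.10.1:1111")
    .withMulticastEnabled()
    .withManagementGroup(PrimaryBackupPartitionGroup.builder("system")
        .withNumPartitions(1)
        .build())
    .withPartitionGroups(PrimaryBackupPartitionGroup.builder("data")
        .withNumPartitions(32)
        .build())
    .build();
```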

But that doesn’t make dynamic discovery useless in the context of Raft. Only nodes that participate in the Raft protocol need to know about each other. Other “client” nodes don’t have to know about the Raft nodes. This is effectively what we do in our application of Atomix: set up a cluster of Atomix agents configured with a Raft partition group, then configure client nodes to discover the Raft nodes and where the Raft partitions live. The client nodes then discover each other through the Raft nodes, and we maintain the strong consistency guarantees of Raft.
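The Raft side of that layout might look roughly like this (a sketch against the Atomix 3 builder API; member IDs and addresses are placeholders): the Raft group’s membership is listed statically, while client nodes can still discover the cluster dynamically.

```java
import io.atomix.core.Atomix;
import io.atomix.protocols.raft.partition.RaftPartitionGroup;

// One of the Raft agents: the Raft groups' membership must be static and
// known to every member, so it is listed explicitly here. Client nodes
// join the cluster without any partition groups of their own and simply
// use the partitions hosted by these agents.
Atomix atomix = Atomix.builder()
    .withMemberId("raft-1")
    .withAddress("192.168.10.1:1111")
    .withMulticastEnabled()  // clients may still find the agents dynamically
    .withManagementGroup(RaftPartitionGroup.builder("system")
        .withNumPartitions(1)
        .withMembers("raft-1", "raft-2", "raft-3")
        .build())
    .withPartitionGroups(RaftPartitionGroup.builder("raft")
        .withNumPartitions(3)
        .withMembers("raft-1", "raft-2", "raft-3")
        .build())
    .build();
```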

There are many different ways you can configure the cluster. The way we do it has, I think, the best trade-offs for configuration, performance, scalability, and consistency