These are chat archives for atomix/atomix

Jan 2018
Johno Crawford
Jan 03 2018 00:46
@kuujo only six more ;)
Jordan Halterman
Jan 03 2018 01:32

If you have three nodes and want to add a fourth, just start the fourth with the first 3 nodes as the bootstrap nodes. The new configuration will be replicated and existing partitions will be automatically rebalanced. That’s what the test I added does. The only caveat is that the number of Raft partitions cannot change, which is what’s broken here.

The first two are opposed to each other. If we replicate the number of partitions, then the number of partitions is essentially based on the initial cluster configuration. If you start a three node cluster with three partitions, then adding a fourth won’t change the number of partitions and the user doesn’t have to worry about the configuration so much. Alternatively, if the number of partitions is a required configuration that’s not based on the initial cluster configuration, then it’s the responsibility of the user to ensure all the nodes are configured with the same number of partitions.

I think what may make sense is this:
• The number of Raft partitions should be replicated in the cluster metadata
• In order to support the use case where a single node is used to bootstrap a larger cluster, the default number of partitions for a 1-node cluster should be > 1
• But the number of partitions for a >1-node cluster should be n*2 unless otherwise specified

This allows a cluster to be bootstrapped from a single node without creating a single-partition cluster, allows the number of partitions to be customized, and prevents the partition configuration on new nodes from conflicting with the actual number of partitions, which is the problem with manual configuration.
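The proposed defaults can be sketched as a small piece of logic (this is a hypothetical helper, not Atomix API; the single-node default of 7 is an assumption, the proposal only requires it to be > 1):

```java
// Sketch of the proposed default partition-count rule: an explicit
// configuration always wins, a single bootstrap node still gets a
// multi-partition cluster, and larger clusters default to n*2 partitions.
final class RaftPartitionDefaults {
  // Hypothetical default for the single-node bootstrap case; any value > 1
  // works, so a cluster grown from one node isn't stuck with one partition.
  static final int SINGLE_NODE_DEFAULT = 7;

  static int numPartitions(int bootstrapNodes, Integer configured) {
    if (configured != null) {
      return configured;          // user-specified count always wins
    }
    if (bootstrapNodes == 1) {
      return SINGLE_NODE_DEFAULT; // bootstrap-from-one-node case
    }
    return bootstrapNodes * 2;    // default for a >1-node cluster
  }
}
```

Whatever count this produces for the initial cluster would then be replicated in the cluster metadata, so later nodes inherit it rather than computing their own.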

I guess the difference between the two is the difference between the system ensuring there’s a consistent number of partitions and the user ensuring there is. But it can be a mixture of both: the user can specify the number of partitions initially, and the system can enforce it when nodes are added.
Johno Crawford
Jan 03 2018 01:40
Makes sense, sounds like a good approach to me.
Paweł Kamiński
Jan 03 2018 10:58
@kuujo so if I understand correctly, this explains why nodes added later to a cluster with default settings hang on start? Could atomix at least throw an exception with a detailed message explaining why the node cannot join? I guess cases where new nodes come with a different config will be very common as users learn how to configure the cluster, and shutting down the whole cluster to change something is lame!
Paweł Kamiński
Jan 03 2018 11:06
in a perfect world it would be great if atomix verified the (potentially changed) configuration of a joining node against the running cluster and failed if something is misconfigured
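The kind of check being asked for might look like this on the joining node (a sketch with hypothetical names, not Atomix code): compare the locally configured partition count against the count replicated in the cluster metadata, and fail fast with a descriptive exception instead of hanging.

```java
// Hypothetical fail-fast check for a joining node: the cluster's partition
// count is fixed by the replicated metadata, so a mismatched local config
// should be rejected with a clear message rather than hang on start.
final class PartitionConfigCheck {
  static void validate(int localPartitions, int clusterPartitions) {
    if (localPartitions != clusterPartitions) {
      throw new IllegalStateException(String.format(
          "Cannot join cluster: node is configured with %d Raft partitions "
              + "but the cluster is running %d; the partition count is "
              + "fixed by the cluster metadata",
          localPartitions, clusterPartitions));
    }
  }
}
```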
Johno Crawford
Jan 03 2018 11:39
@pawel-kaminski-krk I think that should be dealt with when adding support for re-partitioning; once atomix/atomix#374 is pulled in, it should be pretty solid.