These are chat archives for atomix/atomix

7th
Sep 2018
suimi
@suimi
Sep 07 2018 03:12
@kuujo how do I exit the cluster? There is a leave operation on the Raft server, but it is not used at the Atomix cluster level
Are there any plans for this?
Jordan Halterman
@kuujo
Sep 07 2018 07:51
@suimi depends on what you mean by exit the cluster. You want to remove a node from the cluster, in other words affecting the size of the Raft quorum? Or you just want to shut down a node?
Currently, resizing Raft partition groups is not supported for a variety of very complicated reasons
It will be supported some time in the future, but not high on the list of priorities ATM
@mmanco are stateful sets working well for you? I seem to be running into kubernetes/kubernetes#45779, which is I think the same problem some of our other teams have run into as well
suimi
@suimi
Sep 07 2018 07:54
@kuujo got it, thanks
Jordan Halterman
@kuujo
Sep 07 2018 08:06
It’s actually fairly straightforward to expose in a Java API. The more complicated part is how we do it for agents. I guess we could just do it through the REST API, but that’s probably a whole rabbit hole - allowing nodes to be reconfigured at runtime - for Atomix 3.1.x.
Hmm actually maybe there’s an easy way to support it. Maybe I’ll look into it when I get a chance
Maxim Manco
@mmanco
Sep 07 2018 11:33
@kuujo Works smoothly so far
I am using a headless service
Each instance waits for all hosts in the bootstrap list to become available, a sort of bootstrap validation. Once all hosts are available the cluster can start forming
Maxim Manco
@mmanco
Sep 07 2018 11:38
This is how I approached it and I don't see any issues
Maxim Manco
@mmanco
Sep 07 2018 12:37
The thing is that DNS records are updated only when a pod is ready. Until then the hostname of the pod is not resolvable. This forces us to defer cluster start (atomix.start()) until all hosts are resolvable.
Luca Burgazzoli
@lburgazzoli
Sep 07 2018 12:38
I'm using DNS+SRV discovery in k8s
with a custom discovery provider
that queries DNS every now and then and updates the list of nodes
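For readers following along, a rough sketch of the "query DNS every now and then and update the list of nodes" idea. This is not @lburgazzoli's provider and does not use any Atomix API: `SrvNodePoller`, `lookupSrv`, `parseSrv` and the listener callback are hypothetical names. The lookup uses the JDK's built-in JNDI DNS provider and assumes SRV records in the usual "priority weight port target" form.

```java
import java.util.Hashtable;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.InitialDirContext;

// Hypothetical helper, not part of Atomix: polls a lookup and reports membership changes.
class SrvNodePoller {
  private final Supplier<Set<String>> lookup;
  private final Consumer<Set<String>> onChange;
  private Set<String> known = new TreeSet<>();

  SrvNodePoller(Supplier<Set<String>> lookup, Consumer<Set<String>> onChange) {
    this.lookup = lookup;
    this.onChange = onChange;
  }

  /** Resolves SRV records with the JDK's JNDI DNS provider and returns "host:port" entries. */
  static Set<String> lookupSrv(String srvName) {
    Hashtable<String, String> env = new Hashtable<>();
    env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
    Set<String> nodes = new TreeSet<>();
    try {
      InitialDirContext ctx = new InitialDirContext(env);
      try {
        Attribute srv = ctx.getAttributes(srvName, new String[] {"SRV"}).get("SRV");
        if (srv != null) {
          NamingEnumeration<?> records = srv.getAll();
          while (records.hasMore()) {
            nodes.add(parseSrv(records.next().toString()));
          }
        }
      } finally {
        ctx.close();
      }
    } catch (NamingException e) {
      throw new IllegalStateException("SRV lookup failed for " + srvName, e);
    }
    return nodes;
  }

  /** Turns "priority weight port target." into "target:port". */
  static String parseSrv(String record) {
    String[] fields = record.trim().split("\\s+");
    String target = fields[3];
    if (target.endsWith(".")) {
      target = target.substring(0, target.length() - 1);
    }
    return target + ":" + fields[2];
  }

  /** Runs one lookup; notifies the listener only when the node set changed. */
  synchronized boolean pollOnce() {
    Set<String> current;
    try {
      current = new TreeSet<>(lookup.get());
    } catch (RuntimeException e) {
      return false; // keep the last known list if the lookup fails
    }
    if (current.equals(known)) {
      return false;
    }
    known = current;
    onChange.accept(current);
    return true;
  }

  /** Polls on a single background thread; the caller shuts the executor down. */
  ScheduledExecutorService start(long periodSeconds) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleWithFixedDelay(this::pollOnce, 0, periodSeconds, TimeUnit.SECONDS);
    return scheduler;
  }
}
```

Usage would be something like `new SrvNodePoller(() -> SrvNodePoller.lookupSrv(name), nodes -> ...).start(30)`, where `name` is the SRV name of your headless service and the listener feeds whatever discovery hook you have. Note that a failed lookup keeps the previous list, so nodes are only dropped when DNS answers with a smaller set.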
Maxim Manco
@mmanco
Sep 07 2018 13:18
@lburgazzoli at what stage are you doing the discovery?
Luca Burgazzoli
@lburgazzoli
Sep 07 2018 13:48
it is driven by the api
Maxim Manco
@mmanco
Sep 07 2018 14:08
I see, great effort to make Atomix available to Spring based applications!
Jordan Halterman
@kuujo
Sep 07 2018 18:45

@mmanco

Each instance is waiting for all hosts of the bootstrap list to become available

I think this is what I need. How did you go about doing this?

We built this into ONOS for that case, blocking startup until a configuration with the hostnames becomes available, but it seems like you could take a few different approaches
Jordan Halterman
@kuujo
Sep 07 2018 18:56
maybe use an init container?
Jordan Halterman
@kuujo
Sep 07 2018 19:49
actually that probably wouldn’t work
Jordan Halterman
@kuujo
Sep 07 2018 21:26
I actually seem to just be having DNS issues in general
Maxim Manco
@mmanco
Sep 07 2018 23:37
The key is not to block container start, since DNS will be updated only once the container is up and the pod signals a ready state. I can share the code snippet that does that on Monday when I am back at work, but it's really simple: just poll the bootstrap nodes for reachability on a separate thread, and once all are reachable call atomix.start
Had to go this route since Address will throw an exception in case the host is not reachable
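Until the real snippet is shared, a minimal sketch of what that polling could look like. It is not @mmanco's code: `BootstrapWaiter`, `awaitAll` and `resolves` are hypothetical names, and "reachable" is assumed here to mean "the hostname resolves in DNS". The callback is where the `atomix.start()` call from the discussion would go.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;

// Hypothetical helper, not part of Atomix.
class BootstrapWaiter {

  /** Assumed availability check: the host counts as available once its name resolves. */
  static boolean resolves(String host) {
    try {
      InetAddress.getByName(host);
      return true;
    } catch (UnknownHostException e) {
      return false;
    }
  }

  /**
   * Polls on a separate daemon thread until every host passes the check,
   * then runs onReady. The calling thread is never blocked, so the container
   * can finish starting and the pod can become ready.
   */
  static CompletableFuture<Void> awaitAll(
      List<String> hosts, Predicate<String> check, long pollMillis, Runnable onReady) {
    CompletableFuture<Void> done = new CompletableFuture<>();
    Thread poller = new Thread(() -> {
      try {
        while (!hosts.stream().allMatch(check)) {
          Thread.sleep(pollMillis);
        }
        onReady.run(); // e.g. () -> atomix.start().join()
        done.complete(null);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        done.completeExceptionally(e);
      } catch (RuntimeException e) {
        done.completeExceptionally(e);
      }
    }, "bootstrap-waiter");
    poller.setDaemon(true);
    poller.start();
    return done;
  }
}
```

Called as `BootstrapWaiter.awaitAll(hosts, BootstrapWaiter::resolves, 1000, () -> atomix.start().join())`, this defers cluster start until all bootstrap hostnames resolve. The Atomix instance should be built inside the callback as well if constructing its addresses fails for unresolvable hosts.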
Maxim Manco
@mmanco
Sep 07 2018 23:57
But, I think the correct approach will be a custom BootstrapProvider. Something similar to what @lburgazzoli is working on