These are chat archives for atomix/atomix

30th
Jun 2016
niquola
@niquola
Jun 30 2016 10:45
Hello, I'm new to Atomix. How can I get Copycat cluster information for Atomix replicas?
niquola
@niquola
Jun 30 2016 11:10
Or why is it private? Or maybe I'm asking for the wrong thing?
Jordan Halterman
@kuujo
Jun 30 2016 18:28
What type of information do you want? We can expose it if it makes sense. The reason it's hidden is because calling some of the server or Cluster methods directly could mess up the replica. So, it depends on what we need to expose and why, and maybe we can do it.
niquola
@niquola
Jun 30 2016 19:51
I want to create a distributed supervisor for HA PostgreSQL, and I need to read and report the cluster state.
I've accessed the cluster through reflection :)
I'm quite new to distributed algorithms. Could you advise me on the best way to acquire a lock for the master? Consul has a TTL for locks and locks some keys by session. Could I simulate that somehow using Atomix? Or are there other patterns for holding master status in this type of system?
niquola
@niquola
Jun 30 2016 19:56
Another useful Consul feature is health checks. What is the best way to implement this in Atomix?
I have a lot of questions, hope I'm not bothering you too much :)
Another concern is bootstrapping and restarting agents. As I understand it, Atomix persists the cluster state and reuses it after a restart? So I need to bootstrap and join only at initialization time? Could I get the persisted cluster state at agent start? How can I deregister a dead replica in the cluster?
niquola
@niquola
Jun 30 2016 20:02
Could you provide some links to similar projects?
niquola
@niquola
Jun 30 2016 21:03
A basic example project with all the infrastructure stuff included (like logging, bootstrapping, joining, restarting) would be helpful. I hope that once I get it, I may contribute.
Jordan Halterman
@kuujo
Jun 30 2016 22:01
sorry I was driving...

I’ve accessed the cluster through reflection

Gotta do what you gotta do :-P

Jordan Halterman
@kuujo
Jun 30 2016 22:11

Another concern is bootstrapping and restarting agents. As I understand it, Atomix persists the cluster state and reuses it after a restart? So I need to bootstrap and join only at initialization time? Could I get the persisted cluster state at agent start? How can I deregister a dead replica in the cluster?

This is correct if you’re using StorageLevel.DISK or StorageLevel.MAPPED. The cluster configuration is stored on each node in the .metadata file. You can actually read it directly from the Storage object:

Storage storage = Storage.builder(StorageLevel.DISK)…build();
MetaStore metaStore = storage.openMetaStore(serverName);
MetaStore.Configuration configuration = metaStore.loadConfiguration();

But the proper way to access it would be through the Copycat Cluster accessible via CopycatServer#cluster(). The Cluster object is actually initialized when the server is created as opposed to when it’s started, so I think you should be able to read the configuration prior to calling bootstrap or join via the server’s Cluster.

Even though the configuration is persisted, you still need to call either bootstrap or join to recover a failed node; the method is just idempotent. If a server is always started via bootstrap, it will only ever actually bootstrap the cluster if there is no persisted configuration. Subsequent calls to bootstrap will just cause the server to start. Whether this consistency makes it easier or more difficult to use depends on how people use it.
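To illustrate the idempotent bootstrap described above, here is a minimal sketch in plain Java. It does not use Copycat: the class `IdempotentBootstrapSketch` and all of its methods are hypothetical, and an in-memory field stands in for the persisted configuration on disk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model only; none of these names come from Copycat or Atomix.
final class IdempotentBootstrapSketch {
    // Stands in for the configuration persisted on disk; null means nothing persisted yet.
    private List<String> persistedMembers = null;
    private boolean running = false;

    // Forms a new cluster only when no configuration has been persisted.
    // Otherwise it simply starts the server with the existing configuration.
    // Returns true if this call actually created the cluster.
    boolean bootstrap(List<String> initialMembers) {
        boolean created = false;
        if (persistedMembers == null) {
            persistedMembers = new ArrayList<>(initialMembers);
            created = true;
        }
        running = true;
        return created;
    }

    // Simulates a crash or shutdown: the process stops but the configuration survives.
    void stop() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }

    // Returns the persisted membership, or an empty list if nothing was persisted.
    List<String> members() {
        return persistedMembers == null
            ? Collections.<String>emptyList()
            : Collections.unmodifiableList(persistedMembers);
    }
}
```

In this model, calling `bootstrap` again after `stop()` restarts the server without touching the stored membership, which is the behaviour the message describes for a restart with a persisted configuration.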

You can also remove a failed node from the cluster via the Cluster API:

server.cluster().member(someMemberIdOrAddress).remove();

Similarly, you can promote() or demote() a member remotely. The way this is done in Atomix is by using the cluster's leader election and calling remove/promote/demote only from the leader. Internally, Copycat will ensure that changes to the cluster configuration can only be made by a node that has an up-to-date configuration, so if you use the leader election API to make changes from the leader it will be done safely. Even if two leaders exist and the older leader attempts to remove another node from the cluster, that operation will fail since it has been superseded by a newer leader.
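The "older leader is superseded" rule can be pictured with a small sketch. This is a simplified model in plain Java, not Copycat's implementation: the class `GuardedMembershipSketch`, the `term` counter, and the rule "reject any change carrying an older term" are assumptions chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model only; Copycat's real checks are part of its Raft implementation.
final class GuardedMembershipSketch {
    private final Set<String> members = new LinkedHashSet<>();
    // Assumed monotonically increasing counter that advances with each new leader.
    private long term = 1;

    GuardedMembershipSketch(Set<String> initialMembers) {
        members.addAll(initialMembers);
    }

    long currentTerm() {
        return term;
    }

    // Simulates a new leader being elected; returns the new term.
    long electNewLeader() {
        term++;
        return term;
    }

    // Removes a member only if the caller acts under the current term.
    // A caller still holding an older term has been superseded and is rejected.
    boolean remove(String member, long callerTerm) {
        if (callerTerm < term) {
            return false;
        }
        return members.remove(member);
    }

    // Returns a copy so callers cannot mutate the membership directly.
    Set<String> members() {
        return new LinkedHashSet<>(members);
    }
}
```

A leader that captured its term before `electNewLeader()` was called gets `false` back from `remove`, while the same request made with the new term succeeds.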

@niquola ^^
I’m not really opposed to exposing the CopycatServer or the Cluster in Atomix.
niquola
@niquola
Jun 30 2016 22:17
Thank you for the detailed answers! :+1: Don't you think a TTL for the lock is a good idea for the liveness of the system?
Jordan Halterman
@kuujo
Jun 30 2016 22:19
Well, in Atomix’s DistributedLock there is still an effective timeout and thus a liveness guarantee based on session timeouts. If a client’s session expires after acquiring a lock, the lock will be released. So, effectively the session expiration mechanism behaves like a lease on the lock that’s renewed each time the client sends a keep-alive to the cluster.
Copycat will call close(ServerSession) on the state machine when a client’s session is unregistered or is expired by the leader, and the lock state machine releases the lock if it’s associated with that session: https://github.com/atomix/atomix/blob/master/concurrent/src/main/java/io/atomix/concurrent/internal/LockState.java#L42-L59
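As a rough picture of the behaviour described above, here is a minimal sketch of a lock state machine that releases the lock when the holder's session closes. It is plain Java and not the linked LockState code: the class `SessionLockSketch`, the use of `long` session ids, and the FIFO hand-off to waiters are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model only; not the actual Atomix LockState implementation.
final class SessionLockSketch {
    // Session id currently holding the lock, or null if the lock is free.
    private Long holder = null;
    // Sessions waiting for the lock, in arrival order.
    private final Deque<Long> waiters = new ArrayDeque<>();

    // Grants the lock immediately if it is free; otherwise queues the session.
    // Returns true if the lock was granted right away.
    boolean lock(long sessionId) {
        if (holder == null) {
            holder = sessionId;
            return true;
        }
        waiters.addLast(sessionId);
        return false;
    }

    // Releases the lock if held by this session and hands it to the next waiter, if any.
    void unlock(long sessionId) {
        if (holder != null && holder.longValue() == sessionId) {
            holder = waiters.pollFirst();
        }
    }

    // Models the callback invoked when a session is unregistered or expires:
    // the session stops waiting, and any lock it holds is released.
    void close(long sessionId) {
        waiters.remove(Long.valueOf(sessionId));
        unlock(sessionId);
    }

    Long holder() {
        return holder;
    }
}
```

Because `close` is driven by session expiration, a client that stops sending keep-alives eventually loses the lock, which is what makes the session timeout behave like a lease.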
niquola
@niquola
Jun 30 2016 22:21
Cool, can I configure the session timeout?
Jordan Halterman
@kuujo
Jun 30 2016 22:21
yep… it’s in the CopycatServer.Builder or AtomixReplica.Builder methods
actually need to move it to the client though
forgot to open an issue for that, thanks for reminding me :-)
withSessionTimeout is the method you want
niquola
@niquola
Jun 30 2016 22:24
And again, thank you for the help. Atomix is amazing! Hope my adventure will end up with a useful solution :)
Jordan Halterman
@kuujo
Jun 30 2016 22:25
currently, the session timeout is set on servers. Whichever node is the leader at the time a session is registered will send its session timeout back to the client, and the client will send keep-alives to the cluster at a rate of .5 * sessionTimeout
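The timing relationship above is simple arithmetic and can be sketched as follows. This is plain Java with hypothetical names (`KeepAliveSketch`, `keepAliveInterval`, `isExpired`); the expiry rule shown, "expired once more than one full timeout has passed without a keep-alive", is an assumption and not taken from Copycat.

```java
import java.time.Duration;

// Hypothetical helper; it only illustrates the 0.5 * sessionTimeout relationship.
final class KeepAliveSketch {
    private KeepAliveSketch() {
    }

    // The client sends keep-alives at half the session timeout reported by the leader.
    static Duration keepAliveInterval(Duration sessionTimeout) {
        return sessionTimeout.dividedBy(2);
    }

    // Assumed rule: a session expires once the silence exceeds the full timeout.
    static boolean isExpired(Duration sinceLastKeepAlive, Duration sessionTimeout) {
        return sinceLastKeepAlive.compareTo(sessionTimeout) > 0;
    }
}
```

Under this model, a 5 second session timeout gives a 2.5 second keep-alive interval, so a client can miss one keep-alive and still renew before the timeout elapses.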
np I’m always here to answer questions except when I’m driving I guess :-P
niquola
@niquola
Jun 30 2016 22:26
It was strange to me that I could not find any talks about Atomix at conferences :(
Gonna fix it on our next clojure meetup
Jordan Halterman
@kuujo
Jun 30 2016 22:33
yeah… I have been lazy. I have only submitted a talk to one conference and that one was not selected. But I’m actually giving a recorded talk on it pretty soon that I want to put online.
Jordan Halterman
@kuujo
Jun 30 2016 22:41
but talks would be great!
@jhalterman is a Clojure fanatic
that’s, like, all he ever talks about
haha