These are chat archives for atomix/atomix

12th
May 2016
Jonathan Halterman
@jhalterman
May 12 2016 00:23
@jsavlov Re: your message yesterday about the atomix-all jar (sorry I didn't see it till today), what ended up being the problem?
your pom.xml looked fine to me
Jordan Halterman
@kuujo
May 12 2016 07:56
I did some profiling tonight and committed some pretty significant performance improvements. I noticed a major bug that was causing the Cluster to load its configuration from disk on every commit. I added options for better management of flushing buffers to disk. And I added a circular buffer for entries recently appended to the log. Leaders frequently append entries and then read them back from the log. This optimization holds them in memory, so leaders rarely ever actually read from disk, and followers don't read from disk when doing consistency checks.
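As a rough illustration of that circular-buffer idea (names and structure are illustrative, assuming contiguous log indexes; this is a sketch, not the actual Copycat code):

```java
/**
 * Sketch of an entry cache: a fixed-size circular buffer that retains the
 * most recently appended log entries in memory, so reads of recent indexes
 * can skip the disk. Illustrative only.
 */
final class EntryBuffer<E> {
  private final Object[] entries;
  private long lastIndex; // highest log index appended so far

  EntryBuffer(int size) {
    this.entries = new Object[size];
  }

  /** Caches an entry under its log index, overwriting the oldest slot. */
  void append(long index, E entry) {
    entries[(int) (index % entries.length)] = entry;
    lastIndex = index;
  }

  /** Returns the cached entry, or null if it has been evicted from the buffer. */
  @SuppressWarnings("unchecked")
  E get(long index) {
    if (index > lastIndex || index <= lastIndex - entries.length) {
      return null; // caller falls back to reading from disk
    }
    return (E) entries[(int) (index % entries.length)];
  }
}
```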
Richard Pijnenburg
@electrical
May 12 2016 07:56
Oh very nice!
loading config on each commit is a bit weird yeah :-)
because that’s also in memory right?
Kevin Daly
@kedaly
May 12 2016 17:21
Ok one more question: when new clients join an Atomix cluster, do they automatically see all of the servers? Say, for example, I have 3 active servers and 3 passive servers, and the clients connect to the 3 active servers. If they all fail, will the clients fail over to the passive servers?
Even if those servers were not in the config?
And realistically how many clients can an atomix cluster scale to?
Roman Pearah
@neverfox
May 12 2016 17:25
I suppose you mean if the active ones fail all at once?
Kevin Daly
@kedaly
May 12 2016 17:25
Yes
Roman Pearah
@neverfox
May 12 2016 17:25
before the clients could learn of a promoted passive?
good question
Kevin Daly
@kedaly
May 12 2016 17:26
I guess if it's that important we'd have 5 or more active servers and point to all of them..
I just religiously design for failure scenarios
Jordan Halterman
@kuujo
May 12 2016 18:00
@kedaly yes clients will automatically find new servers/replicas. Technically, a client only needs to know about one live server in the cluster. Once the client connects, each time the client sends a keep-alive it will receive back an updated list of members. IIRC clients only learn about the active members of the cluster. So, in the event an active member fails and is replaced, we assume the client has additional active members to which it can connect to learn of the new active member. Indeed, this is necessarily true, since a majority of the active members in the cluster must be alive to replace a failed active member. But this does mean there's some time bound within which too many servers cannot fail, otherwise a client can lose all the servers in the cluster. But this is true in general. If a majority of the active members in the cluster fail simultaneously, they can't be replaced. In theory, if all the active nodes in a cluster were replaced in sequence fast enough that the client could not keep up with change notifications, a client could lose the cluster, but in practice that's an insane scenario.
The overhead of a client is roughly equivalent to two writes per session timeout. So, by default that can be something like one write per second IIRC. In my tests last night I saw about 3k non-concurrent writes from a single client and 20k concurrent writes from a single client. So, it's likely Copycat/Atomix can handle thousands of clients. You can increase the session timeout to reduce the overhead of a client's session at the expense of slower failure detection.
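A back-of-the-envelope version of that overhead math (the 2s default timeout is inferred from "one write per second"; all numbers are illustrative, not measured):

```java
public class SessionOverhead {
  public static void main(String[] args) {
    // Roughly two keep-alive writes per session timeout, as described above.
    double sessionTimeoutSec = 2.0;                         // assumed default
    int clients = 1_000;
    double writesPerClientPerSec = 2.0 / sessionTimeoutSec; // ~1 write/s
    double clusterLoad = clients * writesPerClientPerSec;   // ~1,000 writes/s
    System.out.printf("keep-alive load: %.0f writes/s%n", clusterLoad);
    // A 10s timeout would cut this to ~200 writes/s, at the cost of taking
    // up to 10s to detect a failed client.
  }
}
```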
Kevin Daly
@kedaly
May 12 2016 18:07
Thanks Jordan, I'm building an application that uses Apache Ignite / Vert.x as a base for a replacement of our CDH cluster (Hadoop is so yesterday ;) ) We're doing in-memory streaming.. I think Atomix is a much better solution for our uses than ZooKeeper.. Thanks for the great work..
Jordan Halterman
@kuujo
May 12 2016 18:08
Awesome! I concur :-)
Kevin Daly
@kedaly
May 12 2016 18:09
What I'm working on is a pattern using DNS to locate Atomix clusters so that nodes can be launched in our VPC with Zero Config.. truly elastic!
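One possible shape for that DNS-based bootstrap, as a hedged sketch: resolve a well-known name to the current node set and use the results as the client's initial contact list. The hostname and port below are hypothetical, and handing the seeds to the Atomix client is left abstract, since it depends on the client API version in use.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class DnsBootstrap {
  public static void main(String[] args) throws UnknownHostException {
    // Resolve a well-known DNS name to the cluster's current A records.
    // "atomix.internal.example.com" and port 8700 are hypothetical.
    List<String> seeds = new ArrayList<>();
    for (InetAddress a : InetAddress.getAllByName("atomix.internal.example.com")) {
      seeds.add(a.getHostAddress() + ":8700");
    }
    // As noted above, the client only needs one live server from this list;
    // it learns the rest of the membership from keep-alive responses.
    System.out.println("bootstrap seeds: " + seeds);
  }
}
```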
Jordan Halterman
@kuujo
May 12 2016 18:12
I'm really interested to see how that works out... This is something that's been pretty frequently discussed and is an interesting challenge within the constraints of consensus
Jonathan Halterman
@jhalterman
May 12 2016 23:51
@kedaly Very cool sounding use case. Would love to see a blog post/writeup on how you make out. Blog posts on Atomix in general are very much welcome and appreciated! :)
@kuujo Another bit to document with respect to overhead in general, in addition to log usage, is what you mentioned above re: client session overhead.
Understanding the behind-the-scenes overhead, and particularly how it is affected and can be tweaked by Atomix's configuration, is a good thing for users.
Jonathan Halterman
@jhalterman
May 12 2016 23:57
@kuujo unrelated, i wonder what the rationale is behind an "-all" jar being an uberjar, as most -all jars i've seen are. useful for non-maven projects certainly, but otherwise the uberjar prevents users from overriding transitive dependencies (if they wish), though that's probably less of an issue with atomix having no external transitive dependencies aside from slf4j.
Jordan Halterman
@kuujo
May 12 2016 23:58
indeed
not sure
Jonathan Halterman
@jhalterman
May 12 2016 23:58
....if you're using atomix embedded, chances are you're using some maven repo based build thing. if you're not embedding, chances are you'd just use a standalone server?
which reminds me, we still need to do a docker image at some point.
that passes through args and config to the standalone server