These are chat archives for atomix/atomix

16th Mar 2018
Tej
@vvstej
Mar 16 2018 05:26 UTC
Frequently hitting java.lang.OutOfMemoryError: GC overhead limit exceeded with Atomix
What's the recommended JVM heap size for an application bootstrapped as an Atomix cluster?
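For context: the heap ceiling behind that error is set with the standard JVM flags; the thread never settles on a specific number, so the sizes below are purely illustrative and the jar name is a placeholder:

```
java -Xms4g -Xmx4g -XX:+HeapDumpOnOutOfMemoryError -jar atomix-app.jar
```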
Jordan Halterman
@kuujo
Mar 16 2018 17:47 UTC
@vvstej what version are you using?
Jordan Halterman
@kuujo
Mar 16 2018 17:59 UTC
oh wait, you're using Atomix 1.x?
the stack trace above references Copycat, which was removed in Atomix 2.x
It’s hard to tell in that case. There have been massive improvements in stability/efficiency since Atomix 1.x
Jordan Halterman
@kuujo
Mar 16 2018 20:12 UTC
FYI I’m back from all my trips now and hanging out at home for a while :-)
Tej
@vvstej
Mar 16 2018 21:27 UTC
@kuujo I thought 1.x was more stable
Jordan Halterman
@kuujo
Mar 16 2018 21:47 UTC

There was a point at which that was true, but it has long since passed. We’ve been using the Atomix 2.x Raft implementation for many months and are in the process of deploying it in some very large production networks.

The Raft implementation in Atomix 2.x was significantly refactored to be much more stable for its intended use case (distributed systems primitives). We rewrote the log to have a better memory footprint, replaced incremental compaction with snapshotting, rewrote the session abstraction to isolate primitives from each other, and fixed a ton of bugs in the process. It now has far more time and effort put into its stability than Atomix 1.x ever did.

Tej
@vvstej
Mar 16 2018 21:47 UTC
I see, that sounds very promising; do you think it's ready for production usage?
I've done a lot of performance testing of my cluster on 1.x and have unfortunately seen quite a few issues, so I'm leaning away from the 1.x direction
Are there any decent-scale production use cases on 2.x yet? Even though it was never advertised as a key-value store, I'm currently using Atomix as a distributed key-value store, via its primitives, for some ephemeral data
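For context, the key-value usage described here maps onto Atomix's map primitive. A minimal sketch, assuming the later builder-style API (method and class names shifted between 2.x and 3.x releases, so treat these as approximate):

```java
import io.atomix.core.Atomix;
import io.atomix.core.map.AtomicMap;

public class EphemeralKvSketch {
  public static void main(String[] args) {
    // Illustrative single-node setup; a real cluster needs member discovery config.
    Atomix atomix = Atomix.builder()
        .withMemberId("member-1")       // hypothetical member id
        .withAddress("10.0.0.1:5679")   // hypothetical address
        .build();
    atomix.start().join();

    // A consensus-backed map used as a small key-value store for JSON blobs.
    AtomicMap<String, String> map = atomix
        .<String, String>atomicMapBuilder("ephemeral-kv")
        .build();

    map.put("item-1", "{\"payload\":\"...\"}"); // ~1 KB JSON blob per entry
    String value = map.get("item-1").value();   // reads return a Versioned wrapper

    atomix.stop().join();
  }
}
```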
Tej
@vvstej
Mar 16 2018 21:53 UTC
Each item is about 1 KB in size and we'll hold roughly 1,000 items at any point in time, with lots of updates to those items within a 4-hour window. The perf roadblocks I hit are OutOfMemory errors and frequent brain freezes (nodes just look for each other forever)
Jordan Halterman
@kuujo
Mar 16 2018 21:54 UTC
I have actually been running scale tests all day today :-P
Tej
@vvstej
Mar 16 2018 21:54 UTC
sweet, hope results are good :)
Jordan Halterman
@kuujo
Mar 16 2018 21:55 UTC
How many nodes?
Since I have it set up already I can do some profiling for you
Tej
@vvstej
Mar 16 2018 21:55 UTC
3-node cluster
Jordan Halterman
@kuujo
Mar 16 2018 21:55 UTC
K
Tej
@vvstej
Mar 16 2018 21:56 UTC
JSON blobs of about 1 KB each, and at least a few hundred thousand of them
8-core machines with a 2 GB JVM heap, while total RAM is 24 GB. I've now increased the heap size to 4 GB and am giving it a shot
150 GB NFS disk
3 nodes across 2 data centers, so there is some network hopping too :(
Jordan Halterman
@kuujo
Mar 16 2018 22:04 UTC

Yeah, that can be a problem. We've discussed it a lot but haven't tested how it behaves across geographies. The communication is configured to adapt to changing network conditions. It seems to work well when the network is heavily utilized, but we just haven't tested it with an unreliable inter-DC link.

Long term we’re likely to implement a gossip protocol to handle this type of replication between data centers, but we have no immediate need for it so it’s not even in any release plans yet.

Jordan Halterman
@kuujo
Mar 16 2018 22:11 UTC
anyways… running the test now
Tej
@vvstej
Mar 16 2018 22:12 UTC
I'd have to find out how much geographical separation our DCs have, but it's good to know about this limitation
@kuujo any production use cases for 2.x yet?
and also, what's your recommendation for using Atomix as the ephemeral store I was talking about above?
Jordan Halterman
@kuujo
Mar 16 2018 22:21 UTC

The only use cases I know about and pay attention to are the ones I work on for ONOS :-) Like I said, we're in the process of a very large production deployment, but I don't know where others are in deploying it, nor at what scale.

My recommendation for ephemeral state depends on your consistency requirements. For a read-heavy workload, consensus is perfectly fine since data is held in memory. For write-heavy workloads, Atomix 2.x is a lot faster since it's sharded internally, but there are better protocols if you don't need the consistency. That's the reason Atomix 2.x has a second primary-backup (multi-primary) protocol: primitives can be partitioned into many more partitions, stored only in memory, and replicated synchronously or asynchronously, or to fewer nodes.
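As a concrete sketch of that second protocol, a primitive can be configured with the multi-primary (primary-backup) protocol instead of Raft; builder names again follow the later public API and are assumptions for any specific 2.x release:

```java
import io.atomix.core.map.AtomicMap;
import io.atomix.primitive.Replication;
import io.atomix.protocols.backup.MultiPrimaryProtocol;

// 'atomix' is a started instance as in the earlier sketch.
// Ephemeral, in-memory map replicated via primary-backup rather than Raft:
// fewer copies and optionally asynchronous replication, at the cost of
// weaker consistency guarantees.
AtomicMap<String, String> ephemeral = atomix
    .<String, String>atomicMapBuilder("ephemeral-kv")
    .withProtocol(MultiPrimaryProtocol.builder()
        .withBackups(1)                             // replicate to fewer nodes
        .withReplication(Replication.ASYNCHRONOUS)  // or SYNCHRONOUS
        .build())
    .build();
```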

Tej
@vvstej
Mar 16 2018 23:35 UTC
Consistency is key in our case, hence going the route of exploring Atomix
We're OK with consistency lag, but definitely not with missing writes
Jordan Halterman
@kuujo
Mar 16 2018 23:44 UTC
so I'm running a 3-node cluster with 4 GB of memory, continuously writing a million keys from 8 clients
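The test harness itself isn't shown in the thread; a hypothetical load loop along these lines (all names illustrative) conveys the shape of the workload:

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the load described above: 8 writer threads
// continuously putting ~1 KB values under a million distinct keys.
ExecutorService clients = Executors.newFixedThreadPool(8);
char[] payload = new char[1024];
Arrays.fill(payload, 'x');
String value = new String(payload); // ~1 KB value

for (int i = 0; i < 8; i++) {
  clients.submit(() -> {
    Random random = new Random();
    while (!Thread.currentThread().isInterrupted()) {
      // 'map' is the AtomicMap from the earlier sketch
      map.put("key-" + random.nextInt(1_000_000), value);
    }
  });
}
```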
Tej
@vvstej
Mar 16 2018 23:45 UTC
is this with 2.x?
how's the performance?
Jordan Halterman
@kuujo
Mar 16 2018 23:45 UTC
yeah
about 8k writes/sec ATM
starting to put some pressure on GC but no real issues
this is in LXC containers… don’t remember how they’re configured though
Tej
@vvstej
Mar 16 2018 23:46 UTC
I see .. ok
also, is the write/read consistency set to ATOMIC or SEQUENTIAL?
Jordan Halterman
@kuujo
Mar 16 2018 23:49 UTC
SEQUENTIAL, but since I'm not reading it doesn't matter
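For reference, read consistency is set per primitive through the protocol configuration. A sketch using the later API names; in some releases the linearizable option is called ATOMIC, in others LINEARIZABLE, so check the enum for your version:

```java
import io.atomix.core.map.AtomicMap;
import io.atomix.protocols.raft.MultiRaftProtocol;
import io.atomix.protocols.raft.ReadConsistency;

// 'atomix' is a started instance as in the earlier sketch.
// Cheaper sequential reads instead of linearizable ones.
AtomicMap<String, String> map = atomix
    .<String, String>atomicMapBuilder("bench-map")
    .withProtocol(MultiRaftProtocol.builder()
        .withReadConsistency(ReadConsistency.SEQUENTIAL)
        .build())
    .build();
```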
Tej
@vvstej
Mar 16 2018 23:50 UTC
So what I noticed is that frequent ATOMIC reads of hundreds of items caused a significant perf impact; the compaction processes had to work a lot :)
Jordan Halterman
@kuujo
Mar 16 2018 23:52 UTC
this test can be configured with different read/write ratios and consistency levels
total number of keys, key/value size, total number of distinct values, whether to listen for events, etc.
Tej
@vvstej
Mar 16 2018 23:52 UTC
cool
Jordan Halterman
@kuujo
Mar 16 2018 23:53 UTC
1 million 1 KB entries consume about 1 GB of memory just for the map state itself
I can run these benchmarks on our bare metal cluster
haven’t done it in a while