These are chat archives for atomix/atomix

25th
Mar 2016
Jordan Halterman
@kuujo
Mar 25 2016 17:25
It does seem like there may be a memory leak on the leader
hmm maybe not
I do see their memory drop a lot after log compaction
going to run them all day with MEMORY logging and see what happens
whoah
Jordan Halterman
@kuujo
Mar 25 2016 17:35
So... I'm running tests on Copycat with MEMORY storage. It seems generally fine so far, though I haven't run them long enough. But something I did see is when I killed and restarted a node (meaning it had to recover from the leader since all state was lost) I saw the leader's memory usage jump to > 1GB.
Richard Pijnenburg
@electrical
Mar 25 2016 17:38
that can’t be good..
Jordan Halterman
@kuujo
Mar 25 2016 17:39
couldn’t reproduce it a second time
it’s sort of cheating running with MEMORY storage to find bugs but still probably useful
Richard Pijnenburg
@electrical
Mar 25 2016 17:41
hehe yeah
Richard Pijnenburg
@electrical
Mar 25 2016 17:57
making things fail can be easy. making things fail consistently so you can fix them is hard :-)
Jordan Halterman
@kuujo
Mar 25 2016 17:57
haha indeed
actually I just can’t get YourKit to work so that’s not helping :-(
Richard Pijnenburg
@electrical
Mar 25 2016 18:12
never used it. what does it do? :-)
Jordan Halterman
@kuujo
Mar 25 2016 18:12
it’s a profiler
inspect all the memory in the JVM, find memory leaks, find expensive code, etc
Richard Pijnenburg
@electrical
Mar 25 2016 18:13
ahhh okay. nice :-)
Jordan Halterman
@kuujo
Mar 25 2016 18:13
it’s really extensive but not working outside of my IDE :-(
Richard Pijnenburg
@electrical
Mar 25 2016 18:13
annoying
Jonathan Halterman
@jhalterman
Mar 25 2016 19:22
@kuujo there should be a way to attach yourkit to a running process without the IDE
if that's helpful
Jordan Halterman
@kuujo
Mar 25 2016 19:22
I know. That is what doesn't work
Jonathan Halterman
@jhalterman
Mar 25 2016 19:22
bummer
Jordan Halterman
@kuujo
Mar 25 2016 19:22
Sometimes it says no process running at localhost
Sometimes it causes a fault and core dump in the JVM
I need to figure out how it's run in IntelliJ which works perfectly fine
Jonathan Halterman
@jhalterman
Mar 25 2016 19:24
can you see the process with JConsole?
Jordan Halterman
@kuujo
Mar 25 2016 19:26
Yeah and I can see it in YourKit
Even running it with the YourKit agent gets the same result
Jordan Halterman
@kuujo
Mar 25 2016 19:34
can connect through jconsole
haha
was able to connect to one of them
Jordan Halterman
@kuujo
Mar 25 2016 19:41
…and then crashed
ahh I figured it out
something to do with the port
when I start it with the agent and specify the port it works
java -jar -agentpath:/Applications/YourKit.app/bin/mac/libyjpagent.jnilib=port=8080 target/value-state-machine.jar logs/server1 localhost:5000 localhost:5001 localhost:5002
Jordan Halterman
@kuujo
Mar 25 2016 20:10
ahh totally reproduced it in YourKit
Jonathan Halterman
@jhalterman
Mar 25 2016 20:10
:thumbsup:
do tell
Jordan Halterman
@kuujo
Mar 25 2016 20:15
[image: Memory]
Jonathan Halterman
@jhalterman
Mar 25 2016 20:16
this is using Storage.MEMORY?
that's off heap?
er, StorageLevel.MEMORY
Jordan Halterman
@kuujo
Mar 25 2016 20:18
On heap
I need to create separate off/on heap storage levels
Anyways... That's what happens on the leader when the cluster has been running for a long time and I kill and restart a node. The leader is catching up the new server and its memory explodes.
Need to test it with persistent storage too
But other than that, I wasn't able to reproduce any general issue with memory being increasingly consumed by MEMORY storage
It seems to manage itself pretty well
I ran it with ~750 writes/sec for an hour on memory storage
Maybe that issue is specific to Atomix though
Madan Jampani
@madjam
Mar 25 2016 20:22
@kuujo which objects are taking up space on the heap?
Jordan Halterman
@kuujo
Mar 25 2016 20:23
I took a memory snapshot too late :-( but I didn't see anything obvious. Just the buffers in the log
Gonna try again and take a more useful snapshot
Madan Jampani
@madjam
Mar 25 2016 20:23
also what is the underlying resource state machine?
Jonathan Halterman
@jhalterman
Mar 25 2016 20:24
the log should reach equilibrium at some point where writes and cleaning don't really affect things much either way, right?
Jordan Halterman
@kuujo
Mar 25 2016 20:24
I am just running the value-state-machine example in Copycat's repo
memory eventually went back to normal since my snapshot forced another leader to be elected
gonna try again
ConcurrentSkipListMap$Node
Jordan Halterman
@kuujo
Mar 25 2016 20:30
ahh see that’s just SegmentManager’s map
doesn’t really make sense… hmm
Madan Jampani
@madjam
Mar 25 2016 20:31
how many objects are there of that type?
Jordan Halterman
@kuujo
Mar 25 2016 20:36
yeah see 31
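For context on why a few dozen `ConcurrentSkipListMap$Node` instances are unlikely to be the culprit: a sorted map keyed by each segment's first log index holds only one node per segment. The sketch below assumes that is roughly how the map is used. `SegmentLookup` and its methods are hypothetical names for illustration, not Copycat's actual `SegmentManager` API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration; class and method names are not Copycat's.
final class SegmentLookup {
  // Maps the first log index of each segment to a label for that segment.
  private final ConcurrentSkipListMap<Long, String> segments = new ConcurrentSkipListMap<>();

  void addSegment(long firstIndex, String name) {
    segments.put(firstIndex, name);
  }

  /** Returns the segment covering the given index, or null if the index precedes every segment. */
  String segmentFor(long index) {
    // floorEntry finds the greatest key less than or equal to the index.
    Map.Entry<Long, String> entry = segments.floorEntry(index);
    return entry != null ? entry.getValue() : null;
  }

  /** One map entry per segment, so this stays small even for a long log. */
  int size() {
    return segments.size();
  }
}
```

With this layout the map grows with the number of segments rather than the number of entries, so the bulk of the heap would have to be somewhere else, such as the buffers backing the segments.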
Jordan Halterman
@kuujo
Mar 25 2016 21:46
hmmm
Jordan Halterman
@kuujo
Mar 25 2016 22:03
So, in reality the memory actually jumps when the node is killed, not when it’s recovering. It remains pretty high. A lot of objects are allocated and then garbage collected while the node is down, which makes sense. When I restart the node the memory usage flattens again
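One way to watch this kind of pattern without a profiler attached is to sample heap usage and cumulative GC counts from inside the JVM using the standard `java.lang.management` beans. This is a minimal sketch. `HeapSampler` is a hypothetical helper, not part of Copycat or Atomix, and the one-second interval is an arbitrary choice.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of Copycat or Atomix.
final class HeapSampler {
  /** One point-in-time reading of heap usage and cumulative GC activity. */
  static final class Sample {
    final long usedBytes;
    final long gcCount;

    Sample(long usedBytes, long gcCount) {
      this.usedBytes = usedBytes;
      this.gcCount = gcCount;
    }
  }

  static Sample sample() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    long collections = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long count = gc.getCollectionCount(); // -1 if the collector does not report a count
      if (count > 0) {
        collections += count;
      }
    }
    return new Sample(heap.getUsed(), collections);
  }

  public static void main(String[] args) throws InterruptedException {
    // Print a few readings one second apart (interval chosen arbitrarily).
    for (int i = 0; i < 5; i++) {
      Sample s = sample();
      System.out.printf("heap used: %d MB, collections so far: %d%n",
          s.usedBytes / (1024 * 1024), s.gcCount);
      Thread.sleep(1000);
    }
  }
}
```

A rising collection count alongside heap usage that keeps falling back would match short-lived allocations being collected, while usage that climbs across collections would point more toward a leak.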
Jordan Halterman
@kuujo
Mar 25 2016 22:13
I haven’t really found any obvious issues. I think I’ll get back to trying to break things
Jonathan Halterman
@jhalterman
Mar 25 2016 22:51
:thumbsup: