These are chat archives for ManageIQ/manageiq/performance

29th
Nov 2017
Joe Rafaniello
@jrafanie
Nov 29 2017 18:52
@dmetzger57 I'm looking at the appliances you pointed me at this morning... it's currently 785 MB RSS, right?
Dennis Metzger
@dmetzger57
Nov 29 2017 18:53
yes
had to log in to verify
Joe Rafaniello
@jrafanie
Nov 29 2017 18:54
there are 23 workers and 24 drb file descriptors so that makes sense (one for each client and one for the server)
lsof -p 13208 | grep 35525 | wc -l
24
Can we start a new appliance with the same scenario and @NickLaMuro's wondrous initializer script?
Jason Frey
@Fryguy
Nov 29 2017 18:56
before that... how long does it take to reach the high memory usage? like hours or days?
Dennis Metzger
@dmetzger57
Nov 29 2017 18:57
mine reached 785 MB in 2 days, with a 400 VM provider and C&U
Dennis Metzger
@dmetzger57
Nov 29 2017 19:07
@jrafanie one of @NickLaMuro's appliances is showing the growth also, so I'm thinking that's a great place for the "wondrous initializer script" too
Joe Rafaniello
@jrafanie
Nov 29 2017 19:09
Cool, please do it ;-)
I was chatting with Gregg... the idea behind the initializer is that we know what the OS is saying in terms of the size of the ruby heap... we need to see what ruby "thinks" is happening in its heap: whether the number of live objects grows at the same rate as the RSS, what types of objects they are, etc. ... and the thread count because we all like soft pillows ;-)
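For illustration, a minimal sketch of the kind of data such an initializer could log, assuming MRI Ruby on Linux. The method name `heap_snapshot` and the hash layout are hypothetical and not taken from the actual script; only `GC.stat`, `ObjectSpace.count_objects`, `Thread.list` and the `VmRSS` field of `/proc/self/status` are standard.

```ruby
# Hypothetical helper: put the OS view (RSS) next to Ruby's own view of its heap.
def heap_snapshot
  status = "/proc/self/status"

  # VmRSS is reported in kB on Linux; fall back to nil where /proc is missing.
  rss_kb =
    if File.readable?(status)
      line = File.foreach(status).find { |l| l.start_with?("VmRSS:") }
      line.to_s[/\d+/].to_i
    end

  gc = GC.stat
  {
    rss_kb:          rss_kb,                    # what the OS says
    heap_live_slots: gc[:heap_live_slots],      # what Ruby thinks is live
    heap_free_slots: gc[:heap_free_slots],
    old_objects:     gc[:old_objects],
    thread_count:    Thread.list.size,          # Ruby-level threads only
    objects_by_type: ObjectSpace.count_objects  # e.g. :T_STRING, :T_HASH, ...
  }
end
```

Logging `heap_snapshot` at a fixed interval would show whether `heap_live_slots` rises in step with `rss_kb`, or whether the RSS grows while Ruby's heap stays flat.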
Joe Rafaniello
@jrafanie
Nov 29 2017 19:15
@dmetzger57 do you remember what's a normal number of threads for the server process?
this process has 49 threads
# ls -1 /proc/13208/task | wc -l
49
Oleg Barenboim
@chessbyte
Nov 29 2017 19:15
49 threads??
DRb
Joe Rafaniello
@jrafanie
Nov 29 2017 19:15
haha
Dennis Metzger
@dmetzger57
Nov 29 2017 19:16
who said the DRb word
Oleg Barenboim
@chessbyte
Nov 29 2017 19:16
hoping re-arch eliminates DRb from the picture
Joe Rafaniello
@jrafanie
Nov 29 2017 19:16
well, we have 23 drb clients (workers) so I was expecting something in the 25-35 range
Dennis Metzger
@dmetzger57
Nov 29 2017 19:17
The DRb “word” keeps coming up in “leak” conversations, never gets a stake driven through it though
Oleg Barenboim
@chessbyte
Nov 29 2017 19:17
it is a toy that we abused
Nick LaMuro
@NickLaMuro
Nov 29 2017 19:42
sorry, was playing chat-catchup from coming back after lunch
I am going to enrich the appliance with the initializer for MIQServer, as well as instrument a hopefully "every 3 hour timer" for doing a memory dump as well on the appliance we have replicating this issue
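A rough sketch of what a periodic memory dump could look like, assuming MRI's `objspace` extension. The helper names, the file naming scheme and the default directory are hypothetical; the real instrumentation on the appliance may differ.

```ruby
require "objspace"
require "tmpdir"

# Hypothetical helper: write one heap dump (one JSON object per line)
# and return the path of the file that was written.
def dump_heap(dir = Dir.tmpdir)
  path = File.join(dir, "heap-#{Process.pid}-#{Time.now.to_i}.json")
  File.open(path, "w") { |io| ObjectSpace.dump_all(output: io) }
  path
end

# Hypothetical timer: dump from a background thread every `interval` seconds
# (3 hours by default, matching the interval mentioned above).
def start_heap_dump_timer(interval = 3 * 60 * 60, dir = Dir.tmpdir)
  Thread.new do
    loop do
      sleep(interval)
      dump_heap(dir)
    end
  end
end
```

Calling `start_heap_dump_timer` once at boot would be enough. Note that the timer itself adds one more thread to the process being measured.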
Joe Rafaniello
@jrafanie
Nov 29 2017 20:18
@dmetzger57 @NickLaMuro how easy is it to try a "knob" to tweak malloc and run the same scenario on a new appliance?
Nick LaMuro
@NickLaMuro
Nov 29 2017 20:21
probably not that hard, but @dmetzger57 has been setting up appliances for me
Jason Frey
@Fryguy
Nov 29 2017 20:21
the real problem is the 2 day turn-around time :/
Joe Rafaniello
@jrafanie
Nov 29 2017 20:21

I keep coming back to this book: https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_native_memory_fragmentation_and_process_size_growth?lang=en and this comment: https://sourceware.org/bugzilla/show_bug.cgi?id=11261#c12

"There is a solution for people who are stupid enough to create
too many threads. No implementation will be perfect for everyone. The glibc
implementation is tuned for reasonable programs and will run much faster than
any other I tested."

I don't know what the right number is, but I want to say we need to set MALLOC_MMAP_THRESHOLD_=8192 in /etc/default/evm: https://github.com/ManageIQ/manageiq-appliance/blob/eee37f4fe98b8afe00457bcaa2cddd552345a371/LINK/etc/default/evm#L11
I'm thinking we're doing silly things with so many threads and malloc is laughing at us
note that this env variable was previously set incorrectly (without the trailing underscore)
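Since the trailing underscore is easy to miss, here is a small sketch of a sanity check. The constant and method names are hypothetical; the variable names listed are the glibc malloc tunables that are read from the environment with a trailing underscore.

```ruby
# glibc malloc tunables whose environment names end in an underscore.
UNDERSCORED_MALLOC_VARS = %w[
  MALLOC_MMAP_THRESHOLD_
  MALLOC_MMAP_MAX_
  MALLOC_TRIM_THRESHOLD_
  MALLOC_TOP_PAD_
].freeze

# Hypothetical check: return the names that are set WITHOUT the trailing
# underscore, which glibc would silently ignore.
def misspelled_malloc_vars(env = ENV)
  UNDERSCORED_MALLOC_VARS
    .map { |name| name.chomp("_") }
    .select { |bad_name| env.key?(bad_name) }
end
```

For example, `misspelled_malloc_vars("MALLOC_MMAP_THRESHOLD" => "8192")` returns `["MALLOC_MMAP_THRESHOLD"]`, while the correctly spelled `MALLOC_MMAP_THRESHOLD_` yields an empty array.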
Oleg Barenboim
@chessbyte
Nov 29 2017 20:24
what are threads for? DRb?
Joe Rafaniello
@jrafanie
Nov 29 2017 20:24
we have a thread for each drb client
(each worker)
But, alas, we won't know until we have the ruby GC logging to know if the ruby view of the heap matches what we're seeing in the OS
Dennis Metzger
@dmetzger57
Nov 29 2017 20:27
@jrafanie when I get back home I'll spin up a 5.9 appliance and tweak MALLOC_MMAP_THRESHOLD for comparison to the current default run.
Joe Rafaniello
@jrafanie
Nov 29 2017 20:27
make sure you put a trailing underscore
Dennis Metzger
@dmetzger57
Nov 29 2017 20:27
:wink:
Joe Rafaniello
@jrafanie
Nov 29 2017 20:28
@dmetzger57 @NickLaMuro I just realized we can get a "little" info about each thread...
# grep -E "^(Pid|Name|State)" /proc/7162/task/*/status | head
/proc/7162/task/10165/status:Name:    drb.rb:1498
/proc/7162/task/10165/status:State:    S (sleeping)
/proc/7162/task/10165/status:Pid:    10165
/proc/7162/task/10190/status:Name:    ruby
/proc/7162/task/10190/status:State:    S (sleeping)
/proc/7162/task/10190/status:Pid:    10190
/proc/7162/task/10198/status:Name:    ruby
/proc/7162/task/10198/status:State:    S (sleeping)
/proc/7162/task/10198/status:Pid:    10198
/proc/7162/task/10208/status:Name:    ruby
on Dennis' appliance we can see one drb server and 11 drb clients
# grep drb.rb /proc/7162/task/*/status
/proc/7162/task/10165/status:Name:    drb.rb:1498
/proc/7162/task/10217/status:Name:    drb.rb:1645
/proc/7162/task/10221/status:Name:    drb.rb:1645
/proc/7162/task/10227/status:Name:    drb.rb:1645
/proc/7162/task/10379/status:Name:    drb.rb:1645
/proc/7162/task/10395/status:Name:    drb.rb:1645
/proc/7162/task/10407/status:Name:    drb.rb:1645
/proc/7162/task/16513/status:Name:    drb.rb:1645
/proc/7162/task/16570/status:Name:    drb.rb:1645
/proc/7162/task/22611/status:Name:    drb.rb:1645
/proc/7162/task/31638/status:Name:    drb.rb:1645
/proc/7162/task/31642/status:Name:    drb.rb:1645
Nick LaMuro
@NickLaMuro
Nov 29 2017 20:30
are there 11 worker processes?
Joe Rafaniello
@jrafanie
Nov 29 2017 20:30
bingo
yes
we have 13 other non-drb threads on that process
# grep -E "Name" /proc/7162/task/*/status | grep -v drb
/proc/7162/task/10190/status:Name:    ruby
/proc/7162/task/10198/status:Name:    ruby
/proc/7162/task/10208/status:Name:    ruby
/proc/7162/task/10330/status:Name:    ruby
/proc/7162/task/10338/status:Name:    ruby
/proc/7162/task/10347/status:Name:    ruby
/proc/7162/task/16506/status:Name:    ruby
/proc/7162/task/16554/status:Name:    ruby
/proc/7162/task/22568/status:Name:    ruby
/proc/7162/task/31858/status:Name:    ruby
/proc/7162/task/31861/status:Name:    ruby-timer-thr
/proc/7162/task/31863/status:Name:    ruby
/proc/7162/task/7162/status:Name:    ruby
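The two greps above can be combined into one tally. This is a minimal sketch assuming Linux `/proc`; the helper names `thread_names` and `tally_threads` are hypothetical.

```ruby
# Hypothetical helper: read the Name field of every thread of a process.
def thread_names(pid)
  Dir.glob("/proc/#{pid}/task/*/status").map do |status|
    line = File.foreach(status).find { |l| l.start_with?("Name:") }
    # "Name:\tdrb.rb:1645" -> "drb.rb:1645"
    line.to_s.split(":", 2).last.to_s.strip
  end
end

# Hypothetical helper: split the names into DRb threads and everything else.
def tally_threads(names)
  names
    .group_by { |name| name.start_with?("drb.rb") ? "drb" : "other" }
    .transform_values(&:size)
end

# Names as they appear in the output above: 1 DRb server, 11 DRb clients,
# 12 "ruby" threads and the timer thread.
names = ["drb.rb:1498"] + ["drb.rb:1645"] * 11 + ["ruby"] * 12 + ["ruby-timer-thr"]
p tally_threads(names)  # → {"drb"=>12, "other"=>13}
```

On a live appliance the call would be `tally_threads(thread_names(7162))`. A thread that exits between the glob and the read would raise `Errno::ENOENT`, which this sketch does not handle.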