These are chat archives for ManageIQ/manageiq/performance

12th
Oct 2017
Prasad Mukhedkar
@eprasad
Oct 12 2017 11:46
can someone tell me what PSS is?
what does PSS stand for?
Dennis Metzger
@dmetzger57
Oct 12 2017 11:47
Proportional Set Size
It’s the amount of memory a process is using, taking into account shared pages
Prasad Mukhedkar
@eprasad
Oct 12 2017 11:49
@dmetzger57 thanks .. and how is it calculated? I have 3 appliances in a zone that connect to my vmware provider
Dennis Metzger
@dmetzger57
Oct 12 2017 11:53
PSS is per appliance; the system computes the PSS for each process as the memory used solely by that process, plus a proportional share of the memory which is shared with other processes.
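For illustration only (hypothetical numbers), that calculation works out like this:
# Rough illustration of the PSS calculation with made-up numbers:
# each shared region is divided by the number of processes mapping it.
private_kb        = 200_000   # memory mapped only by this process
shared_kb         = 300_000   # memory also mapped by two other processes
processes_sharing = 3

pss_kb = private_kb + shared_kb / processes_sharing
puts pss_kb                   # => 300000, versus an RSS of 500000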
Prasad Mukhedkar
@eprasad
Oct 12 2017 11:56
@dmetzger57 thanks. got it!
Dennis Metzger
@dmetzger57
Oct 12 2017 11:56
are you running into memory issues on your appliances?
Prasad Mukhedkar
@eprasad
Oct 12 2017 11:58
yes, issue reported on bz1479356
Dennis Metzger
@dmetzger57
Oct 12 2017 12:00
do you have the PSS size for the EVM server? I suspect it may be very large, resulting in the forked workers starting with a large PSS footprint - meaning any leak impact will be greater
Prasad Mukhedkar
@eprasad
Oct 12 2017 12:01
I am checking that; we got the logs and are reviewing them to find out the PSS
we shall update the bz with the findings
niroy
@niroy
Oct 12 2017 12:08
@dmetzger57 I have asked customer to run this command to get PSS size
ps axo user,pcpu,pmem,rss --no-heading | awk '{pCPU[$1]+=$2; pMEM[$1]+=$3; sRSS[$1]+=$4} END {for (user in pCPU) if (pCPU[user]>0 || sRSS[user]>10240) printf "%s:@%.1f%% of total CPU,@%.1f%% of total MEM@(%.2f GiB used)\n", user, pCPU[user], pMEM[user], sRSS[user]/1024/1024}' | column -ts@ | sort -rnk2
Dennis Metzger
@dmetzger57
Oct 12 2017 12:12
you can run the smem command on the appliance to get PSS values for all running processes, probably the simplest way to go
niroy
@niroy
Oct 12 2017 12:13
@dmetzger57 thanks, will advise the same to the Cu
Joe Rafaniello
@jrafanie
Oct 12 2017 12:20
Yes. Smem is far easier to use. I think you can put -p or -P MIQ to only show MIQ processes
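(For reference, smem reports the Pss values kept in /proc/<pid>/smaps; here is a minimal Ruby sketch, not smem itself, that sums those fields for one process, assuming a Linux appliance and permission to read that file:)
# Minimal sketch: sum the "Pss:" lines from /proc/<pid>/smaps, in kB.
def pss_kb(pid = Process.pid)
  File.foreach("/proc/#{pid}/smaps").inject(0) do |total, line|
    total + (line =~ /^Pss:\s+(\d+) kB/ ? $1.to_i : 0)
  end
end

puts "PSS: #{pss_kb} kB"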
I haven't looked at the bz but if the server process is greater than 1 gb resident in top, we've seen, but been unable to recreate, issues where the forked workers exceed pss limits before even doing work
Dennis Metzger
@dmetzger57
Oct 12 2017 12:24
this case is specifically the Generic worker exceeding its threshold, but a comment implies the workers are starting with a large PSS value, hence my pondering about the EVM Server
Joe Rafaniello
@jrafanie
Oct 12 2017 12:32
Yeah. Top output log and evm log should be enough. It would be great to have historical ones to figure out why the server process grew
I can help look in a bit after school drop off
In all prior cases, they had to reboot so capture logs first
Joe Rafaniello
@jrafanie
Oct 12 2017 13:22
@niroy @dmetzger57 this is the prior issue where a "bloated" > 1 GB server process was forking child processes that exceeded the PSS limits before doing much work: https://bugzilla.redhat.com/show_bug.cgi?id=1425217#c20
It's important we get the top_output.log for the RSS of the server and child processes
niroy
@niroy
Oct 12 2017 13:25
@jrafanie I have requested the archived logs from the customer; you have the top_output.log in the log directory, will that be enough?
Joe Rafaniello
@jrafanie
Oct 12 2017 13:25
yeah, historical .gz files for both evm.log and top_output.log (or all logs) will be great
If the RSS of the server is > 1 GB, we need to figure out what it's doing to get so large
Keenan Brock
@kbrock
Oct 12 2017 13:26
@jrafanie Do we have a util to read the current RSS of the current process?
is it expensive?
Joe Rafaniello
@jrafanie
Oct 12 2017 13:27
yeah, sysproctable has it
we log both
memory size, memory usage, and proportional set size
I always forget which is VSS and which is RSS but the last is clearly PSS
I believe size == VSS, usage == RSS, and PSS is obvious
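(For reference, a minimal sketch of pulling those values straight from /proc on a Linux appliance, assuming size maps to VSS and usage to RSS as above; sys-proctable exposes similar data, but this just reads the files directly:)
# Minimal sketch: VmSize ~ "memory size" (VSS) and VmRSS ~ "memory usage" (RSS),
# read from /proc/self/status on Linux. PSS comes from smaps, as in the sketch above.
status = File.read("/proc/self/status")
vss_kb = status[/^VmSize:\s+(\d+) kB/, 1].to_i
rss_kb = status[/^VmRSS:\s+(\d+) kB/, 1].to_i
puts "VSS: #{vss_kb} kB  RSS: #{rss_kb} kB"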
Here's the part that I think is important for that linked prior case (with no fix or resolution due to inability to recreate it reliably)

"In the provided top output logs, the server process is ~1.2 GB and each forked worker starts at roughly 1.2 GB RSS. Note, we use PSS private + (shared memory / processes sharing this memory) for threshold checking, but if ruby's poor GC is in play, perhaps this is causing workers start at near or above the normal worker threshold.

Perhaps, if the server process is already at ~1.2 GB RSS and each forked child is starting at ~1.2 GB of RSS, maybe it's more likely that a forked child will allocate/free objects in the OS heap "next" to other objects in the heap, causing more of the process memory to be copied vs. a process that starts at the typical 200-250 MB RSS. It still doesn't seem likely that this would happen nearly immediately. "

then more :book: material that NickL would appreciate...

"So, our appliance memory threshold code is checking PSS (taking into account shared memory), not RSS, so just because the server PSS is greater than the worker threshold, the forked worker will start at the same RSS, but the PSS should be significantly smaller since it's sharing the memory with the server.
It should not immediately start greater than the threshold.

Next is only my best guess. I have no evidence.
What I'm suggesting is that either a combination of OS swapping or ruby's not copy-on-write friendly GC is causing the OS memory pages shared with the server process to be written to, either from freeing or allocating memory for objects on the same OS heap page as the shared memory. I'm not exactly sure how shared memory pages work when your OS memory is low and you need to swap out and swap back. My guess is that shared memory that's swapped out will become private memory for each process when swapped back in. Since the process of scanning vms seems to balloon the process to about 600 MB, perhaps it's more likely that these allocations occur on OS heap pages that also contain memory shared with the server. Again, the test report from comment 22 and 25 don't confirm or reject this since the server process was still relatively small and under the 600 MB threshold."

Joe Rafaniello
@jrafanie
Oct 12 2017 13:34
tl;dr: If top_output.log shows the server process and a worker that "exceeds our memory thresholds immediately" to be greater than 1 GB RSS (what top shows), we need to track the top_output.log and evm.log to figure out what the server was doing and how the child process can be created and nearly immediately write to shared memory, causing CoW pages to be copied to the point of exceeding memory thresholds
Jason Frey
@Fryguy
Oct 12 2017 13:53
need a shorter tl;dr haha
I think you are saying that we need to figure out how the server gets to 1GB in the first place
cause it's a guarantee that any child of a 1GB+ process will also become 1GB+ in a short time because of how Ruby deals with CoW
Dennis Metzger
@dmetzger57
Oct 12 2017 13:54
step #1, verify the size of EVM Server :smile:
Jason Frey
@Fryguy
Oct 12 2017 13:54
@jrafanie I'm happy to hop on a call/bluejeans with you to help figure this out if you need more eyes
this isn't the first time we've seen evm servers getting crazy high numbers
so, yes, we should verify, but it sounds reasonable to be the issue
Joe Rafaniello
@jrafanie
Oct 12 2017 13:56

cause it's a guarantee that any child of a 1GB+ process will also become 1GB+ in a short time because of how Ruby deals with CoW

@Fryguy that's just an assumption. It seems absurd that it writes/frees to enough shared pages to already exceed PSS limits before doing actual work

Jason Frey
@Fryguy
Oct 12 2017 13:56
I don't think it's an assumption, knowing how Ruby's GC works
Joe Rafaniello
@jrafanie
Oct 12 2017 13:56
But yes, the only constant with these reported issues is a bloated server process
Jason Frey
@Fryguy
Oct 12 2017 13:56
it only takes one GC cycle to copy all of the memory
Joe Rafaniello
@jrafanie
Oct 12 2017 13:57
We should be using the safe fork that GCs 3 times before each fork so the shared memory should have already marked things as old
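(Roughly, the idea behind that safe fork, as a simplified sketch and not the gem's actual implementation: run GC a few times before forking so surviving objects are promoted to the old generation, making the child less likely to dirty shared heap pages via its own GC.)
def cow_friendly_fork(&block)
  # Simplified sketch of "GC a few times, then fork".
  3.times { GC.start }
  fork(&block)
end

pid = cow_friendly_fork { sleep 5 }   # child starts with a heap of mostly "old" objects
Process.wait(pid)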
Jason Frey
@Fryguy
Oct 12 2017 13:58
but it's not that it touches the old bit, it's that it touches the other stuff in the pages
Joe Rafaniello
@jrafanie
Oct 12 2017 13:58
yeah, that's the part I'm speculating about
It makes sense but still seems hard to believe
The other thought was that if the system invades swap, shared pages could be swapped out and swapped back in as private pages for each process
But either way, a smaller server process doesn't seem to exhibit this behavior so finding the cause of the server memory growth seems to be the easiest way to fix it
Jason Frey
@Fryguy
Oct 12 2017 14:01
but overall, it doesn't really matter why, right? We know that a child process forked from a 1GB process will quickly hit 1GB
so, we just have to understand how the server hits 1GB
Joe Rafaniello
@jrafanie
Oct 12 2017 14:03
Yeah, the why isn't important. I haven't actually seen logs of a server process that's normal size (and everything is fine) suddenly grow to > 1 GB and workers start to immediately exceed memory. The ones I've seen only show the "after" where the server is already huge.
Jason Frey
@Fryguy
Oct 12 2017 14:04
yeah, that's the key...and we haven't been able to reproduce, so we need logs from before the damage
Joe Rafaniello
@jrafanie
Oct 12 2017 14:04
yes
I wonder if the provider workers' sync_workers calls are connecting to remote systems and could leak memory if there's some exception occurring
Jason Frey
@Fryguy
Oct 12 2017 14:04
My gut is telling me we have some kind of thread leak, but I can't figure out how to prove it :/
that's an interesting idea
Joe Rafaniello
@jrafanie
Oct 12 2017 14:04
yeah, DRb threads are my immediate thought
Keenan Brock
@kbrock
Oct 12 2017 14:05
is MiqProcess.processInfo too expensive to call in every MiqQueue#delivered?
Jason Frey
@Fryguy
Oct 12 2017 14:05
yeah...DRb
Keenan Brock
@kbrock
Oct 12 2017 14:05
think DRb is on everyone's mind
Jason Frey
@Fryguy
Oct 12 2017 14:05
no...or I've never seen it affect anything before
MiqProcess.processInfo is a glorified shell-out, then post-process
Keenan Brock
@kbrock
Oct 12 2017 14:06
hmm
Joe Rafaniello
@jrafanie
Oct 12 2017 14:06
@kbrock if it's the server process, that won't help
Jason Frey
@Fryguy
Oct 12 2017 14:06
actually, it's not even a shell-out...it just reads /proc/##/stat
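(A minimal sketch of that kind of read, not MiqProcess.processInfo itself, pulling vsize and rss out of /proc/<pid>/stat on Linux:)
def stat_memory(pid = Process.pid)
  stat = File.read("/proc/#{pid}/stat")
  # comm (field 2) may contain spaces, so split after the closing paren
  fields = stat.split(')').last.split
  vsize_bytes = fields[20].to_i          # field 23 overall: vsize, in bytes
  rss_bytes   = fields[21].to_i * 4096   # field 24 overall: rss, in pages (assuming 4 KiB pages)
  { :vsize => vsize_bytes, :rss => rss_bytes }
end

p stat_memory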
Joe Rafaniello
@jrafanie
Oct 12 2017 14:06
for queue workers, that might be helpful
Keenan Brock
@kbrock
Oct 12 2017 14:06
yes, MiqQueue.delivered on queue workers
hmm
ok, will see if I can turn that off or something
Joe Rafaniello
@jrafanie
Oct 12 2017 14:07
To your point @kbrock, the server could log memory each iteration through the monitor loop
we already benchmark timings
Keenan Brock
@kbrock
Oct 12 2017 14:07
well, most monitoring tools tell you the memory per http / queue message, not per second
I want to track down a bad player in Generic Worker (not just the long term memory leak)
Jason Frey
@Fryguy
Oct 12 2017 14:11
@kbrock Are you talking about the same issue we are talking about, or is this something different?
Keenan Brock
@kbrock
Oct 12 2017 14:12
https://bugzilla.redhat.com/show_bug.cgi?id=1479356 -- tacked on. GenericWorker killing off every few minutes
Jason Frey
@Fryguy
Oct 12 2017 14:12
(we were talking about 1GB evm server size)
ah, ok, so different topic
Keenan Brock
@kbrock
Oct 12 2017 14:12
srry
Jason Frey
@Fryguy
Oct 12 2017 14:12
I was getting confused :)
no prob, just don't want to conflate the two conversations
oh, wait, that's the same BZ
Dennis Metzger
@dmetzger57
Oct 12 2017 14:15
the cf-prio issue was “added” to the BZ, conflating the two. the BZ is for a slow leak, the cf-prio issue is not the slow leak
Jason Frey
@Fryguy
Oct 12 2017 14:15
BTW, I can reproduce Ruby copying all of the memory from the parent process in a short script:
even with nakayoshi_fork
require 'nakayoshi_fork'

x = 15_000_000.times.map { "a" }
puts "Forking..."

pid = fork(:cow_friendly => true)
if pid.nil? # Child
  GC.start
  sleep 60
else # Parent
  puts "Press Enter..."
  gets
end
then just watch both ruby processes in top or Activity Monitor or whatever
Keenan Brock
@kbrock
Oct 12 2017 14:17
@jrafanie ok, so you don't think the issue is that a message (in the generic worker) is bad, but rather that once GC runs, pss (under the limit) approaches rss (already over the limit) and the worker dies?
Dennis Metzger
@dmetzger57
Oct 12 2017 14:18
we need to see the EVM Server size to know which path
Jason Frey
@Fryguy
Oct 12 2017 14:18
that is the thought...if we see a log that shows the evm server is already too big
Joe Rafaniello
@jrafanie
Oct 12 2017 14:18
What @dmetzger57 said ^
Jason Frey
@Fryguy
Oct 12 2017 14:18
yeah what @dmetzger57 said haha
Joe Rafaniello
@jrafanie
Oct 12 2017 14:19
gitter needs :+1: on messages
Keenan Brock
@kbrock
Oct 12 2017 14:21
@jrafanie monitor is only called from EvmServer. Not called from the worker children. right?
So monitor is to EvmServer what heartbeat is to MiqQueueWorker::Runner ?
Joe Rafaniello
@jrafanie
Oct 12 2017 14:22
well, monitor and heartbeat are both run in a loop
the server's loop clearly does more things
Keenan Brock
@kbrock
Oct 12 2017 14:25
ok, it is easy to find \bheartbeat\b in loops in Runner code and MiqServer
But harder to find where \bmonitor\b is called
Keenan Brock
@kbrock
Oct 12 2017 19:26
@dmetzger57 do you have samples of evm.log feeding into gnuplot?
Dennis Metzger
@dmetzger57
Oct 12 2017 19:34
here’s a quick and dirty ruby script i use to plot the CPU data (free) scraped from top_output.log http://pastebin.test.redhat.com/524199
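(The general idea, as a sketch rather than the pastebin script; top's memory header varies by version, so the regexes here are assumptions to adjust against the actual log: scrape a timestamp and the free-memory figure from top_output.log and print two columns gnuplot can plot directly.)
timestamp = nil
File.foreach("top_output.log") do |line|
  if line =~ /^top - (\d{2}:\d{2}:\d{2})/
    timestamp = $1
  elsif timestamp && line =~ /Mem\s*:?\s.*?(\d+)[k+]?\s+free/
    puts "#{timestamp} #{$1}"   # e.g. "14:15:01 7962812"
  end
end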
Jason Frey
@Fryguy
Oct 12 2017 20:17
that's quick and dirty? :trollface:
@dmetzger57 FWIW, here's a little function that creates charts using gruff gem: https://github.com/ManageIQ/manageiq-gems-pending/blob/master/lib/gems/pending/util/miq_logger_processor.rb#L20-L92
Keenan Brock
@kbrock
Oct 12 2017 20:49
thnx
I'm parsing the miq log in a generic way since the form tends to be name: [###],
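(Given that form, something like this regex-based scrape works; a sketch, using the values logged later in this conversation as the example:)
# Sketch: pull "Name: [value]" pairs out of an evm.log line.
def scrape_metrics(line)
  line.scan(/([A-Za-z][A-Za-z ]*?):?\s*\[([\d,]+)\]/).each_with_object({}) do |(name, value), h|
    h[name.strip] = value.delete(',').to_i
  end
end

p scrape_metrics("PSS: [322,626,000] RSS: [421,384,192] Live Objects [2,271,888]")
# => {"PSS"=>322626000, "RSS"=>421384192, "Live Objects"=>2271888}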
Keenan Brock
@kbrock
Oct 12 2017 23:11

@dmetzger57 ok, so the generic worker is processing C&U (minor spike at 0:50 hrs, big spike at 2:13 hrs)
Consistent with our original signs
There are other EmsCluster.perf_rollup_range calls that tend to take half a second, but at those markers it takes > 3 minutes

before: PSS: [322,626,000] RSS: [421,384,192] Live Objects [2,271,888] Old Objects: [1,465,459]
after:  PSS: [477,910,000] RSS: [580,317,184] Live Objects [3,560,766] Old Objects: [2,318,837]

The values for "old objects" seem to go up and down. So I don't think our issue is ruby caching.
But all of a sudden, the number of live objects spikes and rss takes a big jump.
And it is never the same after that

plot-0-5hrs.png
blue star (live objects) and orange box (old objects) are kind of all over the place. Since the old count keeps hitting lows, that is why I'm saying this is not a caching issue / retained objects issue
but the purple plus (pss) and blue x (rss) sure take a big spike at 2:15. It could be the rollup, or heck, it could just be a coincidence.
plot-2hrs.png
Keenan Brock
@kbrock
Oct 12 2017 23:17
^ that graph is zoomed into the spike. You'll notice the ems cluster perf_rollup_range job that takes a long time to complete there.
That could be a major GC cycle or something.