These are chat archives for ManageIQ/manageiq/performance

5th
Oct 2015
Alex Krzos
@akrzos
Oct 05 2015 13:05
@jrafanie So there are tons of data points here. Google Sheets doesn't like that many rows and won't even let me import the CSV, but it looks like every worker uses more memory on 5.5 than on the 5.4 run. I'm also going to start a new capture immediately after restarting evmserverd to see how it looks when it first starts
Joe Rafaniello
@jrafanie
Oct 05 2015 13:53
ok, @akrzos, that's the expectation with 2.2. How much more per worker type would be useful to know... Do you have a way to measure if the workers are more productive? More queue items processed, EMS refresh durations over time, etc.
Alex Krzos
@akrzos
Oct 05 2015 14:15
OK, so sorting the data from the long-term test is turning into a huge pain, so I'm sorting out this shorter version of the same test
20 minutes' worth of data here
All the refresh workers graphed
in KiB
Alex Krzos
@akrzos
Oct 05 2015 14:26
So in this environment (medium-sized VMware provider, 1k VMs): RefreshWorker 115MiB, Broker 52MiB, RefreshCore 102MiB, GenericWorker ~80MiB x 2, PriorityWorker ~50MiB x 2
Add that with two automate workers
ScheduleWorker as well
let me get the rest of the workers
but this probably adds up to above a GiB more for the same environment
Alex Krzos
@akrzos
Oct 05 2015 14:59
So that tallies to a total of 1,217MiB more to manage the same sized provider with the same workload
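The tally can be sketched in a few lines. This is a minimal sketch that reads the per-worker figures quoted above as increases in MiB; only the worker types with numbers given in chat are summed, so the remainder is whatever the unlisted workers (automate, schedule, and the rest) contributed to the reported 1,217MiB.

```python
# Per-worker memory increases (MiB) as quoted in the chat: (increase, worker count).
# The GenericWorker and PriorityWorker values were given as approximate.
increases_mib = {
    "RefreshWorker": (115, 1),
    "Broker": (52, 1),
    "RefreshCore": (102, 1),
    "GenericWorker": (80, 2),
    "PriorityWorker": (50, 2),
}


def subtotal(increases):
    """Sum the increase times the worker count for every listed worker type."""
    return sum(per_worker * count for per_worker, count in increases.values())


listed = subtotal(increases_mib)
reported_total = 1217  # total increase reported in the chat

print(listed)                   # 529
print(reported_total - listed)  # 688, attributable to workers not itemized here
```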
Alex Krzos
@akrzos
Oct 05 2015 15:36
So I can turn on C&U for that test and see how much more memory we use with C&U workers too
but I'm sure I'm going to have to bump memory on that 5.5 appliance and watch for it swapping
Greg McCullough
@gmcculloug
Oct 05 2015 16:25
The automate workers starting without the role enabled sounds like it needs a BZ. That was not intentional
Alex Krzos
@akrzos
Oct 05 2015 16:25
@gmcculloug got it, I'll open one
Keenan Brock
@kbrock
Oct 05 2015 16:25
I'm looking at this BZ. I'm reading that the memory usage increase is due to increased size requirements of the provider objects
since the base case with small providers takes up the same amount of memory
doesn't look like gems. looks like either running the refresh is taking up more memory (allocating more objects) or the objects themselves are larger
Joe Rafaniello
@jrafanie
Oct 05 2015 16:27
@akrzos So, my first thought is that perhaps the garbage collector environment variables need to be tweaked to start with a higher initial memory footprint but grow slower than it does now
it seems like going from 2.0 -> 2.2, in high object churn code such as EMS refresh, we grow the memory a lot and it never gets returned to the OS even if subsequent full GCs clear up enough space
a side option is looking again at copy on write friendly forking workers with eager loading of our files
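The "start higher, grow slower" idea can be illustrated with a toy model. This is only a sketch: it assumes the heap grows geometrically by a fixed factor each time it runs out of slots, which is a simplification of what Ruby's GC actually does. The two parameters loosely correspond to the `RUBY_GC_HEAP_INIT_SLOTS` and `RUBY_GC_HEAP_GROWTH_FACTOR` environment variables, and every numeric value below is illustrative, not a measured or recommended setting.

```python
def grow_heap(init_slots, growth_factor, needed_slots):
    """Toy model: multiply the heap by growth_factor until it holds needed_slots.

    Returns (number of growth steps, final heap size in slots).
    """
    slots = float(init_slots)
    steps = 0
    while slots < needed_slots:
        slots *= growth_factor
        steps += 1
    return steps, int(slots)


needed = 1_000_000  # hypothetical peak live slots during a refresh

# Small initial heap with aggressive growth (illustrative values).
small_start = grow_heap(10_000, 1.8, needed)

# Larger initial heap with gentler growth (illustrative values).
tuned_start = grow_heap(600_000, 1.1, needed)

for label, (steps, final) in [("small start", small_start), ("tuned start", tuned_start)]:
    print(f"{label}: {steps} growth steps, final heap {final} slots, "
          f"overshoot {final - needed} slots")
```

In this model the tuned configuration reaches the target in fewer steps and overshoots it by less, which is the footprint that would otherwise never be returned to the OS.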
Keenan Brock
@kbrock
Oct 05 2015 16:38
@jrafanie but we are not saying that the gems are causing the issue, but rather the amount of memory used up during the refresh?
Alex Krzos
@akrzos
Oct 05 2015 18:15
@gmcculloug doh well, can't seem to reproduce what I had in my environment with the automate workers not stopping after I restarted evmserverd
Joe Rafaniello
@jrafanie
Oct 05 2015 18:20
@kbrock I'm fairly confident that gems aren't increasing memory usage by orders of magnitude
Alex Krzos
@akrzos
Oct 05 2015 18:28
@kbrock I added 5.3 to the rails console memory size test as well https://ethercalc.org/lq2e7gizmf The averages are 5.3: ~75MiB, 5.4: 113MiB, and 5.5: 166MiB
Keenan Brock
@kbrock
Oct 05 2015 18:29
thnx
Alex Krzos
@akrzos
Oct 05 2015 18:29
so that is fairly significant growth in amount of memory required just to run something in the rails console between those releases
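To put a number on "fairly significant", here is a small sketch that computes the relative growth between releases from the rails console averages quoted above (75, 113, and 166 MiB). The helper name is made up for illustration.

```python
# Average rails console memory (MiB) per release, as quoted in the chat.
console_mib = {"5.3": 75, "5.4": 113, "5.5": 166}


def growth_pct(old, new):
    """Relative growth from old to new, as a percentage of old."""
    return (new - old) / old * 100


for older, newer in [("5.3", "5.4"), ("5.4", "5.5"), ("5.3", "5.5")]:
    pct = growth_pct(console_mib[older], console_mib[newer])
    print(f"{older} -> {newer}: +{pct:.0f}%")
# 5.3 -> 5.4: +51%
# 5.4 -> 5.5: +47%
# 5.3 -> 5.5: +121%
```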
Joe Rafaniello
@jrafanie
Oct 05 2015 18:31
@akrzos we also added more providers with each release so I'd imagine that's where some of that memory growth is coming from, new or updated gems
Dennis Metzger
@dmetzger57
Oct 05 2015 18:36
@akrzos we're wondering how 5.5 performance (say EMS refresh initially) looks compared to 5.4 when you have enough vRAM configured to ensure there's no swapping. I think data for the large environment would be great to have. Not sure if you have data from a 5.5 Large Env run where there was plenty of vRAM available or not.
Alex Krzos
@akrzos
Oct 05 2015 18:38
@dmetzger57 The timings are great for the large environment: we are at 462s in the 99th percentile for an initial refresh, compared to 5.4 at 719s in the 99th percentile
I have some 5.3 results as well, let me dig those up
1007s on 5.3.4.2 in 99%ile
Jason Frey
@Fryguy
Oct 05 2015 18:41
I would really love if the memory could be shown in some sort of treemap
might be nice to visualize relative changes
Alex Krzos
@akrzos
Oct 05 2015 18:42
That is nice looking
Jason Frey
@Fryguy
Oct 05 2015 18:42
I've also found that format really useful for disk space analysis
Alex Krzos
@akrzos
Oct 05 2015 18:42
oh yeah disk inventory x
used that before to find out what's taking the most space on my hdd
Jason Frey
@Fryguy
Oct 05 2015 18:43
I can't remember what the original was called on Windows, but it had a similar name
Alex Krzos
@akrzos
Oct 05 2015 18:43
I captured the same memory metrics with C&U on now too, time to start seeing what the collectors and processors are growing at
Jason Frey
@Fryguy
Oct 05 2015 18:44
cool
Dennis Metzger
@dmetzger57
Oct 05 2015 18:46
@akrzos the 462s is with the refresh patch correct? Assuming that was with the patch, do you have the time for 5.5 without the patch?
Alex Krzos
@akrzos
Oct 05 2015 18:51
So patched
Joe Rafaniello
@jrafanie
Oct 05 2015 18:51
@dmetzger57 by patch, do you mean the batch connection optimization? ManageIQ/manageiq#4073
Alex Krzos
@akrzos
Oct 05 2015 18:51
I only tested on master
I have between 334s-366s on patched on vmware-large
Dennis Metzger
@dmetzger57
Oct 05 2015 18:51
@jrafanie yes
Joe Rafaniello
@jrafanie
Oct 05 2015 18:52
thanks @dmetzger57 ;-)
Dennis Metzger
@dmetzger57
Oct 05 2015 18:55
ok, so 5.5 (including the batch connections change) is about 50% faster than 5.4. How much more vRAM did you have to give the 5.5 system? Looking for the speed vs. memory relationship
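How much faster depends on which ratio is meant. A quick sketch using the 99th percentile initial refresh times quoted earlier (1007s on 5.3.4.2, 719s on 5.4, 462s on 5.5) shows both readings; the function names are made up for illustration.

```python
# 99th percentile initial refresh durations (seconds) quoted in the chat.
p99_seconds = {"5.3.4.2": 1007, "5.4": 719, "5.5": 462}


def time_saved_pct(old, new):
    """Percentage of wall-clock time saved going from old to new."""
    return (old - new) / old * 100


def speedup(old, new):
    """How many times faster new is than old (ratio of durations)."""
    return old / new


saved = time_saved_pct(p99_seconds["5.4"], p99_seconds["5.5"])
ratio = speedup(p99_seconds["5.4"], p99_seconds["5.5"])
print(f"5.4 -> 5.5: {saved:.1f}% less time, {ratio:.2f}x as fast")
# 5.4 -> 5.5: 35.7% less time, 1.56x as fast
```

So the refresh takes roughly 36% less time, which is the same as completing about 1.56 times as fast.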
Alex Krzos
@akrzos
Oct 05 2015 18:57
@dmetzger57 So to run all the benchmarks I just bump vRAM to 16384MiB to make sure there is plenty there
I believe I could get away with small/medium/large providers with the default of 6144MiB
xlarge just has a problem with the memory because the VIMBroker exceeds the default memory threshold so I also adjust that up in the benchmark
Jason Frey
@Fryguy
Oct 05 2015 19:39
I've decided to punt on the delay load of the fast-gettext stuff until @jrafanie's forking workers comes together more
Because once we move to that, we'll want to eager load more
and then delay-loading fast-gettext makes less sense
though we may choose to delay load by default, and eager load just for UI workers
Joe Rafaniello
@jrafanie
Oct 05 2015 19:41
Just rebased locally... for the first time since July
Oleg Barenboim
@chessbyte
Oct 05 2015 19:41
@Fryguy it makes sense to delay load the multiple languages, since one installed base will typically be using a limited number of languages
@Fryguy will be especially true as we move from English and Japanese to 5, 10, 20, ... supported languages
Jason Frey
@Fryguy
Oct 05 2015 19:42
agreed, but if you do that in a forking workers scenario you end up using more memory
or rather, I'd prefer to benchmark it with forking workers to be sure
Oleg Barenboim
@chessbyte
Oct 05 2015 19:43
how much memory is each extra language using roughly?
Jason Frey
@Fryguy
Oct 05 2015 19:43
not sure offhand
if the server loads 20 languages, then forks, you get no penalty
Oleg Barenboim
@chessbyte
Oct 05 2015 19:43
you get the penalty if your installed base is using only 2 of those languages
Jason Frey
@Fryguy
Oct 05 2015 19:43
if the server loads 0 languages and forks 25 times, you pay a 5+ penalty
but you probably don't want to load it for everything if it's not needed
like, say, a broker
much like we don't want to load OpenStack if all the user has is vmware
Oleg Barenboim
@chessbyte
Oct 05 2015 19:44
plus, I believe only the UI, web service worker and reporting worker use the languages
Jason Frey
@Fryguy
Oct 05 2015 19:44
yeah
Oleg Barenboim
@chessbyte
Oct 05 2015 19:44
cannot imagine any provider or other worker needing the languages
Jason Frey
@Fryguy
Oct 05 2015 19:44
so, one idea is to eager load things for the UI worker right before you launch a UI worker
so in that case we will need the delay, but I think it's better to wait until then and see what our requirements are around it
but we can also delay by not putting it in an initializer
Oleg Barenboim
@chessbyte
Oct 05 2015 19:49
not initialize and leverage memcache?
read the email that I just forwarded from DaJo
Jason Frey
@Fryguy
Oct 05 2015 19:59
yeah
I don't think the translation stuff is contributing to that (should be relatively the same in 5.4 to 5.5)
I think translations was something like 6 MB of memory anyway
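The trade-off discussed above can be sketched with simple arithmetic. This is a rough model under stated assumptions: the ~6 MB translation figure is the recollection from chat rather than a measurement, preloaded pages are assumed to stay shared after fork (copy-on-write, never written to), and the counts of 25 forked workers and 3 language-using worker types (UI, web service, reporting) are taken from the conversation as examples.

```python
def preload_then_fork(catalog_mb):
    """Catalog loaded once in the server before forking.

    Assumes children share those pages via copy-on-write and never write to them.
    """
    return catalog_mb


def lazy_load_after_fork(catalog_mb, workers_loading):
    """Catalog loaded separately inside each worker that needs it."""
    return catalog_mb * workers_loading


catalog_mb = 6        # approximate translation footprint recalled in chat
forked_workers = 25   # example fork count from the discussion
language_workers = 3  # UI, web service, and reporting workers

print(preload_then_fork(catalog_mb))                       # 6
print(lazy_load_after_fork(catalog_mb, forked_workers))    # 150
print(lazy_load_after_fork(catalog_mb, language_workers))  # 18
```

Under these assumptions, preloading costs one shared copy, loading in every worker costs 25 copies, and loading only in the workers that use languages lands in between. Whether the shared copy really stays shared is what the benchmark with forking workers would need to confirm.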
Oleg Barenboim
@chessbyte
Oct 05 2015 20:00
that sounds like peanuts
Jason Frey
@Fryguy
Oct 05 2015 20:00
@jrafanie you had the numbers, right?
Dan Clarizio
@dclarizio
Oct 05 2015 20:02
not sure what all gets loaded with it, but the .po file for ja is 350K
Joe Rafaniello
@jrafanie
Oct 05 2015 20:04
delete the po file(s) locally and run rails console, measure before and after... I can do it, but I'm on the forking worker branch, rebasing/testing
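The before/after measurement idea is generic. As an illustration only, here is the same pattern in Python using the standard `tracemalloc` module; the catalog is synthetic and stands in for a translation file, and `tracemalloc` only sees allocations made through Python's allocator, so this is not how the rails console numbers in this chat were obtained.

```python
import tracemalloc


def measure_load(loader):
    """Run loader() and return (result, bytes still allocated because of it)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    loaded = loader()  # keep a reference so the memory stays live
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return loaded, after - before


def load_fake_catalog(entries=10_000):
    """Synthetic stand-in for a translation catalog (hypothetical data)."""
    return {f"msgid {i}": f"translation {i}" for i in range(entries)}


catalog, delta_bytes = measure_load(load_fake_catalog)
print(f"{len(catalog)} entries, ~{delta_bytes / 1024:.0f} KiB retained")
```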
Jason Frey
@Fryguy
Oct 05 2015 20:05
yeah, I thought you had done it
Joe Rafaniello
@jrafanie
Oct 05 2015 21:19
@Fryguy rebased forking workers and verified it still works... without any preloading OSX reports 50+ MB of shared memory per process: ManageIQ/manageiq#3593
I'll work on the preloading/eager loading in a subsequent PR