These are chat archives for ManageIQ/manageiq/performance

16th
May 2016
Alex Krzos
@akrzos
May 16 2016 12:05
Sorry I was on PTO, is there a way to set an away message on gitter?
Alex Krzos
@akrzos
May 16 2016 12:16
@jrafanie with respect to increased memory usage, that depends on what you compare 5.6 to. As @dmetzger57 mentioned, comparing to 5.5.0.13-2 (the first 5.5 release), I am seeing around a 500MiB increase in memory usage (from an appliance standpoint) just from turning on all roles except the Websockets role (since that spawns another worker and I have no baseline for that worker except with 5.6)
I am putting together a comparison with providers too, focusing on the VMware provider for now, and also digging into what QE is seeing
I set up monitored appliances on my hypervisors attached to the provider they are having issues with
With that, I have not seen any swapping on the two appliances I set up on my hypervisor and attached to their provider
Current hypothesis is that the hypervisor load may be contributing to the issue
Joe Rafaniello
@jrafanie
May 16 2016 12:27
It would be helpful to narrow down the 500 mb appliance memory increase to a specific worker comparison... Is the increase across all workers or is one the main culprit?
Alex Krzos
@akrzos
May 16 2016 12:30
Let me put the numbers side by side
Dennis Metzger
@dmetzger57
May 16 2016 12:30
@jrafanie this chart https://gitter.im/ManageIQ/manageiq/performance?at=5735f708eea93e5742d1a440 shows the memory delta (PSS) between 5.5.0.13-2 and 5.6.0-6
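For readers unfamiliar with PSS (proportional set size): on Linux it can be read per process by summing the `Pss:` fields in `/proc/<pid>/smaps`. The sketch below is only an illustration of that idea; the helper names are hypothetical, and it is an assumption that the numbers in this chat were gathered in a comparable way.

```python
import re

# Matches only the plain "Pss:" field, not variants such as "Pss_Dirty:".
_PSS_LINE = re.compile(r"^Pss:\s+(\d+)\s+kB$", re.MULTILINE)


def total_pss_kib(smaps_text):
    """Sum every 'Pss:' value (reported in kB) found in smaps-formatted text."""
    return sum(int(kib) for kib in _PSS_LINE.findall(smaps_text))


def process_pss_mib(pid):
    """Read /proc/<pid>/smaps (Linux only) and return the process PSS in MiB."""
    with open(f"/proc/{pid}/smaps") as f:
        return total_pss_kib(f.read()) / 1024.0
```

Reading another user's `smaps` usually needs root, so something like `process_pss_mib(os.getpid())` is the easiest way to try it.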
Alex Krzos
@akrzos
May 16 2016 12:33
It's across the board: all workers except maybe the UI worker (and that's possibly because the setup of the same test in 5.5.0.13 included some UI interactions, which bumped its memory usage during the timeframe measured; since then I use no UI interactions while measuring this test)
And note that by all workers I mean every worker that an appliance would have when it doesn't have a provider (idle)
Joe Rafaniello
@jrafanie
May 16 2016 12:45
Is this the same number of each worker type? I realize you disabled the new console worker, which is good for comparison
I'd be curious how the lazy loading of message catalogs affects your results, since it is designed to decrease memory in all Rails processes
ManageIQ/manageiq#8525
Alex Krzos
@akrzos
May 16 2016 12:51
End of Test Values (PSS, MiB)  5.6.0.6  5.5.0.13-2  Difference
MiqGenericWorker                201.66      176.53       25.13
MiqGenericWorker                213.30      217.45       -4.15
MiqPriorityWorker               200.36      157.45       42.91
MiqPriorityWorker               201.19      158.74       42.45
MiqScheduleWorker               194.70      169.78       24.92
MiqUiWorker                     198.55      246.37      -47.82
MiqWebServiceWorker             189.06      142.85       46.21
MiqWebsocketWorker                0.00        0.00        0.00
MiqReplicationWorker            194.23      140.11       54.12
MiqEventHandler                 181.87      138.93       42.94
MiqEmsMetricsProcessorWorker    188.55      136.40       52.15
MiqEmsMetricsProcessorWorker    188.27      136.49       51.78
MiqReportingWorker              177.76      136.35       41.41
MiqReportingWorker              177.88      136.36       41.52
MiqSmartProxyWorker             181.51      140.73       40.78
MiqSmartProxyWorker             181.58      138.75       42.83
evm_server.rb                   226.75      191.34       35.41
So depending on which generic worker picked up more work, we can see a difference between the two
(PSS Memory fyi)
reporting workers are a great example here
because they did nothing but cost about 40MiB more each
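As a quick cross-check of the table against the roughly 500MiB appliance-level figure mentioned earlier, the per-worker differences can be summed. This is just arithmetic on the numbers posted above; the `ROWS` layout and function names are illustrative.

```python
# (worker, PSS MiB on 5.6.0.6, PSS MiB on 5.5.0.13-2), copied from the table above
ROWS = [
    ("MiqGenericWorker", 201.66, 176.53),
    ("MiqGenericWorker", 213.3, 217.45),
    ("MiqPriorityWorker", 200.36, 157.45),
    ("MiqPriorityWorker", 201.19, 158.74),
    ("MiqScheduleWorker", 194.7, 169.78),
    ("MiqUiWorker", 198.55, 246.37),
    ("MiqWebServiceWorker", 189.06, 142.85),
    ("MiqWebsocketWorker", 0, 0),
    ("MiqReplicationWorker", 194.23, 140.11),
    ("MiqEventHandler", 181.87, 138.93),
    ("MiqEmsMetricsProcessorWorker", 188.55, 136.4),
    ("MiqEmsMetricsProcessorWorker", 188.27, 136.49),
    ("MiqReportingWorker", 177.76, 136.35),
    ("MiqReportingWorker", 177.88, 136.36),
    ("MiqSmartProxyWorker", 181.51, 140.73),
    ("MiqSmartProxyWorker", 181.58, 138.75),
    ("evm_server.rb", 226.75, 191.34),
]


def total_delta(rows):
    """Sum of (5.6 value - 5.5 value) over all rows, rounded to 2 decimals."""
    return round(sum(new - old for _, new, old in rows), 2)


def idle_worker_deltas(rows, name):
    """Per-process deltas for one worker type, e.g. the idle reporting workers."""
    return [round(new - old, 2) for worker, new, old in rows if worker == name]


if __name__ == "__main__":
    print(total_delta(ROWS))
    print(idle_worker_deltas(ROWS, "MiqReportingWorker"))
```

The total comes to about 532.6 MiB, which lines up with the ~500MiB appliance-level increase, and the two reporting workers each account for a bit over 41 MiB of it.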
Jason Frey
@Fryguy
May 16 2016 14:11
I think this PR is a Big Deal for Ems Refresh worker performance: ManageIQ/manageiq#8668
Basically, before, we only kept the id in memory, but now we keep the whole object in memory. However, the object that we keep also has all of the associated objects as well, so the memory bloat can be pretty large
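The id-versus-object point can be shown in general terms. The sketch below is not the code from ManageIQ/manageiq#8668 and is not Ruby; `Record`, `load_record`, and `associated_survives` are hypothetical names used only to show that caching a whole object keeps everything it references alive, while caching just its id does not.

```python
import gc
import weakref


class Record:
    """Hypothetical stand-in for a model object with associated objects."""

    def __init__(self, record_id, associated=()):
        self.id = record_id
        self.associated = list(associated)


def load_record():
    # One parent record holding three associated records.
    return Record(1, [Record(n) for n in range(100, 103)])


def associated_survives(cache_whole_object):
    """Return True if an associated record is still alive after the caller
    drops its own reference and only the cache remains."""
    record = load_record()
    probe = weakref.ref(record.associated[0])
    # Either keep the whole object, or keep only its id.
    cache = {record.id: record} if cache_whole_object else {record.id}
    del record
    gc.collect()
    return probe() is not None and len(cache) == 1
```

With `cache_whole_object=True` the associated records stay reachable through the cached parent; with `False` they are freed as soon as the parent goes away.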
Keenan Brock
@kbrock
May 16 2016 17:28
using subqueries in active record
==> http://codesnik.github.io/rails/2015/09/03/activerecord-and-exists-subqueries.html
(cleaning out browser tabs)
Keenan Brock
@kbrock
May 16 2016 18:09
Do we use codeclimate that much now?
interesting plugin: https://www.producthunt.com/tech/code-climate-for-google-chrome
Jason Frey
@Fryguy
May 16 2016 18:10
yes, we use it
not sure who looks at it besides me and @jrafanie though :(
Chris Arcand
@chrisarcand
May 16 2016 18:10
:hand:
Jason Frey
@Fryguy
May 16 2016 18:11
aaaaand I signed up
those changes look KILLER
Alex Krzos
@akrzos
May 16 2016 18:23
I have a feeling I should never see this:
[root@CF-QE-R0000-DB-5606-QERHOS6 ~]# ps afx
  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
....
27968 ?        Rl     0:36 MIQ Server
28202 ?        Sl     0:00  \_ MIQ: MiqEventHandler id: 1, queue: ems
28207 ?        Sl     0:00  \_ MIQ: MiqGenericWorker id: 2, queue: generic
28212 ?        Sl     0:00  \_ MIQ: MiqGenericWorker id: 3, queue: generic
28219 ?        Sl     0:01  \_ MIQ: MiqPriorityWorker id: 4, queue: generic
28224 ?        Rl     0:00  \_ MIQ: MiqPriorityWorker id: 5, queue: generic
28230 ?        Sl     0:00  \_ MIQ: MiqReportingWorker id: 6, queue: reporting
28237 ?        Rl     0:00  \_ MIQ: MiqReportingWorker id: 7, queue: reporting
28244 ?        Sl     0:00  \_ MIQ Server
MIQ Server spawned under a MIQ Server process
Jason Frey
@Fryguy
May 16 2016 18:34
that could be related to the issue that @jrafanie just fixed
are you running with his changes?
Alex Krzos
@akrzos
May 16 2016 19:27
without his changes
Jason Frey
@Fryguy
May 16 2016 19:31
OK...it's interesting that one MIQ Server is the child of the other
Alex Krzos
@akrzos
May 16 2016 19:33
It didn't stick around
I'm not sure if it renamed itself or died. I'd theorize that it somehow renamed itself to that before renaming itself yet again later
it did
looking back in my console output
that pid became the MiqScheduleWorker
maybe I typed ps afx right as it forked and it still had the same name
Jason Frey
@Fryguy
May 16 2016 19:50
oh wild...yeah technically it will inherit the proctitle until it changes it
I wonder what the time delay is....I can see how that would be confusing
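The inheritance window is easy to reproduce. A minimal Linux-only sketch follows, with one caveat stated up front: it uses the kernel task name via `prctl(PR_SET_NAME)`, which is not the argv-based title that `ps afx` prints, so it is an analogy for the behaviour rather than the mechanism the workers use. The names `miq-server` and `miq-schedule` are made up for the demo.

```python
import ctypes
import os

PR_SET_NAME = 15  # from <linux/prctl.h>
PR_GET_NAME = 16
_libc = ctypes.CDLL(None, use_errno=True)


def set_name(name):
    """Set the calling thread's kernel task name (truncated to 15 bytes)."""
    _libc.prctl(PR_SET_NAME, name.encode(), 0, 0, 0)


def get_name():
    """Read the calling thread's kernel task name."""
    buf = ctypes.create_string_buffer(16)
    _libc.prctl(PR_GET_NAME, buf, 0, 0, 0)
    return buf.value.decode()


def fork_and_observe():
    """Fork a child and return (name right after fork, name after renaming)."""
    set_name("miq-server")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: still carries the parent's name until it renames itself.
        os.close(read_fd)
        before = get_name()
        set_name("miq-schedule")
        after = get_name()
        os.write(write_fd, f"{before},{after}".encode())
        os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        data = pipe.read()
    os.waitpid(pid, 0)
    return tuple(data.split(","))


if __name__ == "__main__":
    print(fork_and_observe())
```

Anything that samples the process list between the fork and the child's rename would see two processes with the parent's name, which matches the `ps afx` output above.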