These are chat archives for ManageIQ/manageiq/performance

7th
Jan 2016
Joe Rafaniello
@jrafanie
Jan 07 2016 00:30
@akrzos it's all your fault: ManageIQ/linux_admin#149
it seems like a pain to install smem when it's all there in /proc/pid/smaps
ok, that is all for today, good night
:smile:
Joe Rafaniello
@jrafanie
Jan 07 2016 15:04
hey @akrzos, could you set up an appliance, or do you already have one, that goes into swap? I wonder what PSS, RSS, and USS show for a process that has swapped out some memory
Alex Krzos
@akrzos
Jan 07 2016 15:05
@jrafanie I can set one up quickly for you
Joe Rafaniello
@jrafanie
Jan 07 2016 15:05
thanks!
my ruby code is a single file so I'd like to compare smem to that also
I'm curious what the best way is to measure a process's memory consumption if PSS, RSS, and USS don't account for swap properly
Alex Krzos
@akrzos
Jan 07 2016 15:08
I think we just need to do PSS + SWAP
Joe Rafaniello
@jrafanie
Jan 07 2016 15:08
yeah, that's what I want to confirm
Alex Krzos
@akrzos
Jan 07 2016 15:08
I ran smem over the break on an appliance that would swap; let me see if I saved those results, though I doubt I did
Joe Rafaniello
@jrafanie
Jan 07 2016 15:08
I'm hoping that /proc/$pid/smaps has all that we need
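For illustration, a minimal sketch of what pulling those numbers out of smaps could look like. It assumes the usual smaps layout of `Field: N kB` lines and takes USS as Private_Clean + Private_Dirty, which is how smem defines it. `sum_smaps` and the sample text are hypothetical; this is not the code from the linux_admin PR.

```ruby
# Sum selected per-mapping fields from smaps-formatted text (values are in kB).
# `sum_smaps` is a hypothetical helper name, not part of LinuxAdmin or sys-proctable.
SMAPS_FIELDS = %w[Rss Pss Private_Clean Private_Dirty Swap].freeze

def sum_smaps(text)
  totals = Hash.new(0)
  text.each_line do |line|
    # Field lines look like "Pss:                  12 kB"; mapping header
    # lines (address ranges) do not match and are skipped.
    match = line.match(/\A(\w+):\s+(\d+) kB/)
    next unless match && SMAPS_FIELDS.include?(match[1])

    totals[match[1]] += match[2].to_i
  end

  {
    rss:  totals["Rss"],
    pss:  totals["Pss"],
    uss:  totals["Private_Clean"] + totals["Private_Dirty"],
    swap: totals["Swap"]
  }
end

# Two made-up mappings in the smaps layout, for illustration only.
SAMPLE = <<~SMAPS
  00400000-00401000 r-xp 00000000 fd:00 123 /usr/bin/example
  Rss:                 100 kB
  Pss:                  50 kB
  Private_Clean:        20 kB
  Private_Dirty:        10 kB
  Swap:                  0 kB
  7f0000000000-7f0000001000 rw-p 00000000 00:00 0
  Rss:                  40 kB
  Pss:                  40 kB
  Private_Clean:         0 kB
  Private_Dirty:        40 kB
  Swap:                 64 kB
SMAPS

p sum_smaps(SAMPLE)
```

Taking the text as an argument keeps the parsing separate from the file read, so on Linux the same helper could be fed `File.read("/proc/#{pid}/smaps")` when the caller is allowed to read it.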
Alex Krzos
@akrzos
Jan 07 2016 15:09
On another interesting note
just noticed the ui reports memory threshold for c&u collectors as 200MiB
yet I'm definitely way above that
Joe Rafaniello
@jrafanie
Jan 07 2016 15:12
I would love to remove the stuff we have in MiqProcess and use the smem-type data from LinuxAdmin + sys-proctable, since @djberg96 already did much of that work
Keenan Brock
@kbrock
Jan 07 2016 15:15
@akrzos so you are noticing c&u down - right?
Jason Frey
@Fryguy
Jan 07 2016 15:27
@jrafanie I wonder if your smem stuff should go into sys-proctable
and then linux_admin just exposes that stuff
Daniel Berger
@djberg96
Jan 07 2016 15:28
I could add it
Jason Frey
@Fryguy
Jan 07 2016 15:28
See @jrafanie PR on linux_admin
Daniel Berger
@djberg96
Jan 07 2016 15:28
Or someone could send me a patch :)
Jason Frey
@Fryguy
Jan 07 2016 15:28
:D
Daniel Berger
@djberg96
Jan 07 2016 15:28
oh, that reminds me
one person is seeing a problem on OSX with a UserEventAgent process for some reason, but I can't duplicate it
Jason Frey
@Fryguy
Jan 07 2016 15:29
on ManageIQ or on sys-proctable?
Daniel Berger
@djberg96
Jan 07 2016 15:29
if any of you see it, please let me know: djberg96/sys-proctable#44
(er, scroll down a bit)
@Fryguy sys-proctable
Daniel Berger
@djberg96
Jan 07 2016 15:37
i couldn't figure out how to link to a specific comment :(
Keenan Brock
@kbrock
Jan 07 2016 15:37
@djberg96 click on the time
seems to be a common pattern on the interwebs
Daniel Berger
@djberg96
Jan 07 2016 15:38
aha, thanks
Joe Rafaniello
@jrafanie
Jan 07 2016 15:48
@djberg96 I'd be fine with putting the PSS, RSS, USS, stuff from the linux admin PR in sys-proctable but I'm not pretending to want to figure out how to do this on the other platforms that sys-proctable works on
which is why I put it in linux admin
Daniel Berger
@djberg96
Jan 07 2016 15:52
@jrafanie there are only a few fields I try to be consistent with across platforms
I have no problem adding more stuff for Linux
The assumption with that library is that you know what you're looking for on your platform
basically, stuff like comm, pid and ppid - but that's already set, so no worries
Joe Rafaniello
@jrafanie
Jan 07 2016 15:53
ok, i'll see about adding it to sys-proctable then
I need better names for them though
Daniel Berger
@djberg96
Jan 07 2016 15:54
i think rss is already there, unless you need something special
Joe Rafaniello
@jrafanie
Jan 07 2016 15:54
any suggestions on pss, uss, swap, size, etc. ? https://github.com/ManageIQ/linux_admin/pull/149/files#r49029607
Daniel Berger
@djberg96
Jan 07 2016 15:58
not familiar with pss or uss, for swap i have both nswap and cnswap, so would need clarification
is size the same as vsize? if so, already there, too
oh, i see, you're parsing all of that data out of smaps
might be worth adding a bona fide object for it then, like we did for cgroupentry
Daniel Berger
@djberg96
Jan 07 2016 16:03
so, process.smap.rss, and so forth
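A rough sketch of that shape, assuming a small value object hung off the process record. `ProcessInfo`, `SmapsSummary`, their fields, and the numbers are all hypothetical and do not reflect sys-proctable's actual API.

```ruby
# Hypothetical value object for the smaps totals (all values in kB).
SmapsSummary = Struct.new(:rss, :pss, :uss, :swap, keyword_init: true) do
  # Memory attributable to the process whether resident or swapped out.
  def pss_plus_swap
    pss + swap
  end
end

# Hypothetical process record exposing the summary as `smap`.
ProcessInfo = Struct.new(:pid, :comm, :smap, keyword_init: true)

# Made-up numbers, for illustration only.
process = ProcessInfo.new(
  pid:  1234,
  comm: "ruby",
  smap: SmapsSummary.new(rss: 1200, pss: 1000, uss: 900, swap: 500)
)

puts process.smap.rss
puts process.smap.pss_plus_swap
```

Grouping the smaps fields under one object keeps them apart from the existing top-level fields such as rss and vsize, so the names do not collide.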
Daniel Berger
@djberg96
Jan 07 2016 16:40
hm, seems reading smaps requires superuser privs
Joe Rafaniello
@jrafanie
Jan 07 2016 16:45
@djberg96 is that a problem or something we need to check for?
Daniel Berger
@djberg96
Jan 07 2016 16:45
would mean a lot of rescued exceptions
Joe Rafaniello
@jrafanie
Jan 07 2016 16:45
is this the first thing in sys-proctable that needs superuser privs?
Daniel Berger
@djberg96
Jan 07 2016 16:46
not normally, no
my initial instinct is to skip it unless Process.euid == 0
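A minimal sketch of the two options raised here: skipping the read outright unless running as root, or attempting it and rescuing the failure. `smaps_text` and its parameters are hypothetical names, not sys-proctable's API.

```ruby
# Hypothetical helper: returns the smaps text for a pid, or nil when it is
# skipped or cannot be read.
def smaps_text(pid, proc_root: "/proc", require_root: false)
  # Option 1: do not even try unless the effective user is root.
  return nil if require_root && !Process.euid.zero?

  File.read(File.join(proc_root, pid.to_s, "smaps"))
rescue SystemCallError
  # Option 2: attempt the read and treat any Errno failure (permission
  # denied, process already gone, no /proc on this platform) as "no data".
  nil
end

text = smaps_text(Process.pid)
puts text ? "read #{text.bytesize} bytes of smaps" : "smaps skipped or unreadable"
```

The euid check avoids raising and rescuing an exception per process, at the cost of also skipping any smaps files an unprivileged user would have been allowed to read.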
Joe Rafaniello
@jrafanie
Jan 07 2016 16:46
ok
@all ^
Keenan Brock
@kbrock
Jan 07 2016 18:04
thanks
+1
Alex Krzos
@akrzos
Jan 07 2016 19:02
Thanks Guys!
Joe Rafaniello
@jrafanie
Jan 07 2016 20:46
@akrzos I finally played with smem on your appliance that was swapping. It looks like we should be checking PSS + swap; our existing RSS check will not protect us if a process is mostly swapped out
[root@dhcp23-33 ~]# smem -r
  PID User     Command                         Swap      USS      PSS      RSS
24847 root     /var/www/miq/vmdb/lib/worke  1154504  1788832  1788927  1790180
16359 root     /var/www/miq/vmdb/lib/worke   737896  1449852  1449992  1451592
 1254 root     /var/www/miq/vmdb/lib/worke    15228   272508   273173   275080
12454 root     /var/www/miq/vmdb/lib/worke   146012   260580   261245   263148
12370 root     /var/www/miq/vmdb/lib/worke    86724   207828   208043   209740
39408 root     /var/www/miq/vmdb/lib/worke    64904   196028   196158   197636
12584 root     /var/www/miq/vmdb/lib/worke   260916   168452   168546   169884
  818 root     /var/www/miq/vmdb/lib/worke    14080   150544   150625   151824
12593 root     /var/www/miq/vmdb/lib/worke   149052   133032   133198   134892
12587 root     /var/www/miq/vmdb/lib/worke   286296   116548   116754   118456
12590 root     /var/www/miq/vmdb/lib/worke   164920   115328   115474   117092
12581 root     /var/www/miq/vmdb/lib/worke    66628   104980   105038   106028
12596 root     /var/www/miq/vmdb/lib/worke   131524    33704    33778    34912
16385 postgres postgres: root vmdb_product     3036    14744    18050    26604
12376 postgres postgres: root vmdb_product     7056    12308    13184    17084
12647 postgres postgres: root vmdb_product    28904     7116    10293    18956
12649 postgres postgres: root vmdb_product    25244     6392     9412    17292
32403 root     python /usr/bin/smem -r            0     7612     7860     8624
  940 ovirtagent /usr/bin/python /usr/share/    11088     7436     7668     8392
12623 postgres postgres: root vmdb_product    10324     6100     6642     9956
12626 postgres postgres: root vmdb_product    10456     6032     6607     9968
16493 root     ruby /var/www/miq/vmdb/bin/   183656     3716     3788     4660
40264 root     ruby /var/www/miq/vmdb/bin/   183840     3388     3461     4340
39541 postgres postgres: root vmdb_product     7768     2072     2582     5732
12057 postgres postgres: checkpointer proc     1076     1100     1374     2580
24863 postgres postgres: root vmdb_product     1912     1052     1263     2732
    1 root     /usr/lib/systemd/systemd --     2340      932      942     1204
  884 root     /usr/bin/python -Es /usr/sb    16144      868      910     1236
pid 12584 looks to be over 420 MB Swap + RSS but RSS is only reporting ~170 MB
Can you confirm the memory thresholds fail to kill those workers when we're in swap?
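To make the comparison concrete, a small worked sketch using three rows copied from the smem output above. It assumes the smem columns are in KiB (smem's default) and uses PSS + swap rather than the Swap + RSS figure quoted in the message, so the totals differ slightly. `THRESHOLD_KB` is a made-up limit for illustration, not ManageIQ's configured worker threshold.

```ruby
# pid => values in kB, copied from the smem output above.
ROWS = {
  24847 => { swap: 1_154_504, pss: 1_788_927, rss: 1_790_180 },
  12584 => { swap:   260_916, pss:   168_546, rss:   169_884 },
  12596 => { swap:   131_524, pss:    33_778, rss:    34_912 }
}.freeze

# Hypothetical limit (400 MiB expressed in kB), for illustration only.
THRESHOLD_KB = 400 * 1024

# Convert kB to MiB, rounded to one decimal place.
def mib(kilobytes)
  (kilobytes / 1024.0).round(1)
end

def over_threshold?(kilobytes, threshold = THRESHOLD_KB)
  kilobytes > threshold
end

ROWS.each do |pid, mem|
  total = mem[:pss] + mem[:swap]
  puts "pid #{pid}: RSS #{mib(mem[:rss])} MiB (over: #{over_threshold?(mem[:rss])}), " \
       "PSS+swap #{mib(total)} MiB (over: #{over_threshold?(total)})"
end
```

With this made-up limit, pid 12584 stays under the threshold on RSS alone but exceeds it once swap is added, which is the case an RSS-only check would miss.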
Alex Krzos
@akrzos
Jan 07 2016 22:32
@jrafanie I can run some of my scenarios with my largest provider, which gives us the highest memory consumption, on a memory-constrained appliance, and then we can review the output that smem logs
I do think this is probably an "edge" case though, since if you're in swap, you're pretty much in a lot of trouble already
Bad news: I have an 8-hour run of 5.5.2.0 and I'm seeing significantly more memory used
Alex Krzos
@akrzos
Jan 07 2016 22:37
~700MiB at the appliance level
Joe Rafaniello
@jrafanie
Jan 07 2016 22:37
@kbrock are you looking into going back to older sprockets to remove the concurrent-ruby dependency?
I wasn't sure if you were looking at that
Alex Krzos
@akrzos
Jan 07 2016 22:38
I have a 25hr run of 5.5.2.0 cooking too to see the reaction of the Schedule Worker over 24 hours
akrzos @akrzos wishes he could "fast forward miq"
Joe Rafaniello
@jrafanie
Jan 07 2016 22:51
@akrzos is it possible to upgrade 5.5.0.13-2 to rhel 7.2?
Alex Krzos
@akrzos
Jan 07 2016 22:56
@jrafanie I can investigate that
Should be fairly easy to upgrade it to rhel 7.2
Joe Rafaniello
@jrafanie
Jan 07 2016 23:03
ok, thanks, that would help eliminate things
Keenan Brock
@kbrock
Jan 07 2016 23:49
@jrafanie I am just fixing our bugs with sprockets - the invalid references and stuff. want to make our app valid / work with all versions of sprockets.
I'm not looking into memory constraints of sprockets itself.
Joe Rafaniello
@jrafanie
Jan 07 2016 23:50
Ok