These are chat archives for ManageIQ/manageiq/performance

21st
Jan 2016
Alex Krzos
@akrzos
Jan 21 2016 00:25
@kbrock Go right ahead
Joe Rafaniello
@jrafanie
Jan 21 2016 17:03
@akrzos FYI, this is a somewhat useful way to analyze what memory mappings are private in a process: grep -E -B 7 "Private.+:\s+([1-9][0-9]|[1-9])" /proc/PID/smaps
produces this...
VmFlags: rd wr mr mw me dw ac
01ab9000-12a8c000 rw-p 00000000 00:00 0                                  [heap]
Size:             278348 kB
Rss:              272752 kB
Pss:              253251 kB
Shared_Clean:       3988 kB
Shared_Dirty:      17908 kB
Private_Clean:       372 kB
Private_Dirty:    250484 kB
--
12a8c000-1c877000 rw-p 00000000 00:00 0                                  [heap]
Size:             161708 kB
Rss:              155272 kB
Pss:              155272 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:    155272 kB
--
...
--
7f97e6255000-7f97e6256000 rw-p 0002d000 fd:00 1140833                    /opt/rh/rh-postgresql94/root/usr/lib64/libpq.so.rh-postgresql94-5.7
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
--
7f97e7316000-7f97e7317000 rw-p 00006000 fd:00 27365228                   /opt/rubies/ruby-2.2.4/lib/ruby/gems/2.2.0/extensions/x86_64-linux/2.2.0/thin-1.6.4/thin_parser.so
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
--
7f97e781e000-7f97e7820000 rw-p 000f1000 fd:00 38137                      /usr/lib64/libstdc++.so.6.0.19
Size:                  8 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
--
7f97e7a91000-7f97e7a94000 rw-p 0005c000 fd:00 10197510                   /opt/rubies/ruby-2.2.4/lib/ruby/gems/2.2.0/extensions/x86_64-linux/2.2.0/eventmachine-1.0.8/rubyeventmachine.so
Size:                 12 kB
Rss:                  12 kB
Pss:                  12 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:        12 kB
--
7f97e7ea9000-7f97e7eaa000 rw-p 0000b000 fd:00 2030049                    /opt/rubies/ruby-2.2.4/lib/ruby/gems/2.2.0/extensions/x86_64-linux/2.2.0/escape_utils-1.1.0/escape_utils/escape_utils.so
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
--
7f97e80b5000-7f97e80b6000 rw-p 0000b000 fd:00 27526896                   /opt/rubies/ruby-2.2.4/lib/ruby/gems/2.2.0/extensions/x86_64-linux/2.2.0/hamlit-2.0.2/hamlit/hamlit.so
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
--
7f97e82e5000-7f97e82e6000 rw-p 0002f000 fd:00 18374058                   /opt/rubies/ruby-2.2.4/lib/ruby/2.2.0/x86_64-linux/ripper.so
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
--
7f97e913a000-7f97e913b000 rw-p 00001000 fd:00 27108859                   /opt/rubies/ruby-2.2.4/lib/ruby/2.2.0/x86_64-linux/digest/md5.so
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
--
7f97e9345000-7f97e9346000 rw-p 0000a000 fd:00 27223848                   /opt/rubies/ruby-2.2.4/lib/ruby/gems/2.2.0/extensions/x86_64-linux/2.2.0/json-1.8.3/json/ext/generator.so
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared
interestingly many shared objects are private and much of the ruby heap is private in a forked child
So, it looks like there are some opportunities to make these shared
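A rough Ruby sketch of the same analysis, summing the counters across all mappings instead of grepping for non-zero ones. The helper name `smaps_totals` is hypothetical and it assumes only the `Key:   N kB` line format visible in the output above.

```ruby
# Sum the Shared_*/Private_* counters across every mapping in smaps-formatted text.
# `smaps_totals` is an illustrative helper, not something that exists in ManageIQ.
def smaps_totals(smaps_text)
  totals = Hash.new(0)
  smaps_text.each_line do |line|
    # Counter lines look like "Private_Dirty:    250484 kB"
    match = line.match(/\A((?:Shared|Private)_(?:Clean|Dirty)):\s+(\d+) kB/)
    totals[match[1]] += match[2].to_i if match
  end
  totals
end

# Usage on Linux, for the current process:
#   totals = smaps_totals(File.read("/proc/#{Process.pid}/smaps"))
#   totals.sort.each { |name, kb| puts "#{name}: #{kb} kB" }
```

Taking the text as an argument rather than a PID keeps the parsing separate from reading /proc, so it can be tried against a saved smaps dump as well.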
Keenan Brock
@kbrock
Jan 21 2016 18:04
that didn't work
Alex Krzos
@akrzos
Jan 21 2016 18:04
vpn
Keenan Brock
@kbrock
Jan 21 2016 18:04
I'm in the office
hmm
thanks
ugh
jason and I are in the tail end of a demo - hawkular (time series db)
Oleg Barenboim
@chessbyte
Jan 21 2016 18:05
and @dmetzger57 is on PTO today
Alex Krzos
@akrzos
Jan 21 2016 18:05
got it
Oleg Barenboim
@chessbyte
Jan 21 2016 18:06
@akrzos maybe cancel this week's meeting?
Alex Krzos
@akrzos
Jan 21 2016 18:07
@chessbyte I'll be moving the meeting then
Keenan Brock
@kbrock
Jan 21 2016 18:07
@fryguy why did they use a tag for the "cpu_usage" and a common counter vs a cpu_usage counter?
Keenan Brock
@kbrock
Jan 21 2016 18:23
found this: https://docs.influxdata.com/influxdb/v0.9/concepts/schema_and_data_layout/
I assume that these design principles will work for cassandra backed metrics as well
matthewd @matthewd :heart: Graphite, fwiw
Keenan Brock
@kbrock
Jan 21 2016 18:25
yea
I consider graphite and influx the same thing
Oleg Barenboim
@chessbyte
Jan 21 2016 18:26
stupid high-level outside-in question: if we pick a sub-optimal TSDB, how hard/easy is it to move to a different one in a new version of ManageIQ?
Keenan Brock
@kbrock
Jan 21 2016 18:26
cool - there is a time series data modeling class online at datastax
@chessbyte I think easy
of course, all the tooling we add to make our appliance easier will change
Oleg Barenboim
@chessbyte
Jan 21 2016 18:27
if the answer is easy, then we should not over-analyze which TSDB we start with as any of them will be better than PG
Matthew Draper
@matthewd
Jan 21 2016 18:27
I think it depends… if you're going from something that aggregates, to something that keeps per-20-seconds metrics for years, you're going to struggle to recreate that data
Keenan Brock
@kbrock
Jan 21 2016 18:27
@matthewd are you saying the code or the data?
I meant code. didn't think about data
Matthew Draper
@matthewd
Jan 21 2016 18:27
@kbrock the data
Keenan Brock
@kbrock
Jan 21 2016 18:27
:)
Matthew Draper
@matthewd
Jan 21 2016 18:27
The code's easy.. that's just a simple matter of programming :)
Keenan Brock
@kbrock
Jan 21 2016 18:28
just throw some monkeys at the problem...
Oleg Barenboim
@chessbyte
Jan 21 2016 18:28
@matthewd that should be with a :tm:
Matthew Draper
@matthewd
Jan 21 2016 18:29
But that said, I'm deeply (deeply, deeply, deeply) surprised that one can maintain that sort of precise data for long periods, across a bunch of sources, and still analyse it efficiently
Keenan Brock
@kbrock
Jan 21 2016 18:29
+1 - I spent a year on this problem for a trading system
(we used relational behind the scenes though)
but I can see why they want to avoid it
Matthew Draper
@matthewd
Jan 21 2016 18:30
"Cassandra just makes the problem go away" —> many raised eyebrows
Keenan Brock
@kbrock
Jan 21 2016 18:30
also, not totally sure about their examples
Matthew Draper
@matthewd
Jan 21 2016 18:30
I did notice an awful lot of "you could even query over a full 8 hours" type phrasing
(which is why I specifically asked about a multi-year graph)
Keenan Brock
@kbrock
Jan 21 2016 18:31
min(C1, C2) == min(C1) + min(C2) -- I expected min(min(C1),min(C2))
again, working with aggregated data is hard
Matthew Draper
@matthewd
Jan 21 2016 18:31
@kbrock those are two very different metrics
The query shown was min(stack(C1, C2))
Keenan Brock
@kbrock
Jan 21 2016 18:33
aah - thanks
aka - ok. not sure what you said. but at least that makes me feel better.
Matthew Draper
@matthewd
Jan 21 2016 18:34
(no idea whether the other one is something you can answer in a single query.. but you could obviously work it out by separately obtaining min(C1) and min(C2))
Keenan Brock
@kbrock
Jan 21 2016 18:34
yea
but they only ever seemed to use 1 counter/gauge
and one of the tags was what I would have put into the counter name
Matthew Draper
@matthewd
Jan 21 2016 18:35
The stacked one does seem to be more likely to be interesting, though… "minimum CPU usage" across four CPUs is not "the lowest any of the CPUs was", but "the lowest observed value for the CPUs combined"
Keenan Brock
@kbrock
Jan 21 2016 18:37
cool - that explains.
min(stack(C1, C2)) ==
min(C1.zip(C2).map {|c1, c2| c1 + c2})
which is not min(*C1, *C2)
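A small runnable version of that comparison, with made-up sample values for two CPUs (the numbers are assumptions for illustration, not data from the demo):

```ruby
c1 = [10, 40, 30] # made-up samples for one CPU
c2 = [50, 5, 20]  # made-up samples for a second CPU at the same timestamps

# "stack" adds the series together point by point
stacked = c1.zip(c2).map { |a, b| a + b }

min_of_stack = stacked.min      # lowest combined value at any one timestamp
min_of_all   = [*c1, *c2].min   # lowest value any single CPU ever reported
sum_of_mins  = c1.min + c2.min  # adds minimums taken at different timestamps

puts "min(stack(C1, C2)) = #{min_of_stack}" # 45
puts "min(*C1, *C2)      = #{min_of_all}"   # 5
puts "min(C1) + min(C2)  = #{sum_of_mins}"  # 15
```

All three differ here because each CPU hits its own minimum at a different timestamp.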
Matthew Draper
@matthewd
Jan 21 2016 18:38
But yeah… all my experience is in graphite… but I'm not the one who has to do the thing, so I don't think I get a vote :)
Keenan Brock
@kbrock
Jan 21 2016 18:39
nah - it doesn't work that way. he with the loudest bark...
Jason Frey
@Fryguy
Jan 21 2016 18:41
I'll take all opinions (especially those backed by experience) into account
we have to make the right decision for the project together
one thing I thought was interesting was that I hacked our C&U to write to influx in like 3 hours from scratch (i.e. including installing influx)
It took me 3 hours to try to just install cassandra, which I still couldn't get working
and then I went to install hawkular-metrics and it said I needed a wildfly/eap server and I was like
(╯°□°)╯︵ ┻━┻
Keenan Brock
@kbrock
Jan 21 2016 18:44
cassandra is enterprise software :(
Jason Frey
@Fryguy
Jan 21 2016 18:45
Haha
Greg Blomquist
@blomquisg
Jan 21 2016 18:45
wait, I thought cassandra was webscale
Matthew Draper
@matthewd
Jan 21 2016 18:45
Yeah.. I did wonder exactly what clustering requirements we have. If PG's managed to handle it so far (sorta, obviously)…
Greg Blomquist
@blomquisg
Jan 21 2016 18:45
can something be webscale and enterprise?
Joe Rafaniello
@jrafanie
Jan 21 2016 18:46
I think we need a thought leader to weigh in on that one @blomquisg
Greg Blomquist
@blomquisg
Jan 21 2016 18:46
"I don't care, I want an iPhone 5"
Jason Frey
@Fryguy
Jan 21 2016 18:47
Postgres HA/Clustering is on the TODO list so I assume we at least need a plan for whatever TSDB we pick... Not necessarily needed for day 1
Another thing we have to understand is how it would live in the multi region world we have
Greg Blomquist
@blomquisg
Jan 21 2016 18:48
is there any tenancy requirement we need to impose on it?
or, can we rely on our own tenancy for that?
Matthew Draper
@matthewd
Jan 21 2016 18:48
.. and how that works with the potentially very different multi-region world we're heading for?
Joe Rafaniello
@jrafanie
Jan 21 2016 18:49
Jason Frey
@Fryguy
Jan 21 2016 18:49
@matthewd Yes, that too
@chessbyte to @matthewd's point we also might need to decide if we want to bite off new replication and new TSDB in the same release
@jrafanie that's Ruby 2.1.3, so that shouldn't affect us, right?
Joe Rafaniello
@jrafanie
Jan 21 2016 18:52
yes, it was opened when 2.1.3 was probably close to the latest
either way, their demo script reproduces the same thing on 2.2.4
ruby version 2.2.4
   time   pid message             shared    private
 4.041s 12771 Parent pre GC           71          0
 4.041s 12773 Child  pre GC           71          0
 8.062s 12773 Child  post GC           4         68
 8.080s 12771 Parent post GC           0         72
going to add some logging to see if I can see shared -> private after GC in our workers
Jason Frey
@Fryguy
Jan 21 2016 18:53
GC.start before fork?
Or is it GC causing the object to be copied even when it's going to live?
Also yuck... I thought the bitmap support in 2.0 was supposed to take care of this
Joe Rafaniello
@jrafanie
Jan 21 2016 18:57
yeah, it sounds like a specific type of object
maybe off heap objects or something
and yes, I tried putting a GC.start with bookends to ensure a full GC occurs pre-fork
I was just looking at the smaps of the forked processes and was curious why so much is private
Joe Rafaniello
@jrafanie
Jan 21 2016 22:39
@Fryguy @matthewd here's the results of instrumenting the workers to log their "Shared_Dirty" and "Private_Dirty" totals in relation to GC stat counts:
[----] I, [2016-01-21T17:06:19.448744 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 234276 kB, Private: 3184 kB, GC count / minor / major: 21/15/6
[----] I, [2016-01-21T17:06:20.335699 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62936 kB, Private: 175348 kB, GC count / minor / major: 22/15/7
[----] I, [2016-01-21T17:06:21.686873 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62504 kB, Private: 176420 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:22.701741 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62424 kB, Private: 176500 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:23.716814 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62384 kB, Private: 176700 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:24.731669 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62380 kB, Private: 176744 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:25.743980 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62344 kB, Private: 176936 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:26.758553 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62184 kB, Private: 177136 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:27.770105 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62100 kB, Private: 177336 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:28.783232 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62072 kB, Private: 177524 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:29.798305 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62036 kB, Private: 177716 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:30.812900 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 62004 kB, Private: 177916 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:31.827542 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 61976 kB, Private: 178116 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:32.839628 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 61940 kB, Private: 178316 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:33.850857 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 61708 kB, Private: 178680 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:34.946440 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 60448 kB, Private: 180116 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:35.962990 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 60368 kB, Private: 180332 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:36.975591 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 60264 kB, Private: 180532 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:37.992879 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 60228 kB, Private: 180712 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:39.004429 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 58176 kB, Private: 182936 kB, GC count / minor / major: 23/15/8
[----] I, [2016-01-21T17:06:40.919216 #20250:74f98c]  INFO -- : MIQ(MiqPriorityWorker.memory_log) Shared: 46060 kB, Private: 197288 kB, GC count / minor / major: 24/16/8
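For reference, a sketch of how a line in that format could be built. The real `MiqPriorityWorker.memory_log` is not shown in this chat, so `memory_log_line` and its arguments are hypothetical; the `GC.stat` keys `:count`, `:minor_gc_count` and `:major_gc_count` are standard in Ruby 2.2.

```ruby
# Format shared/private totals (in kB) next to the GC counters.
# `memory_log_line` is an illustrative helper, not the actual worker code.
def memory_log_line(shared_kb, private_kb, gc_stat = GC.stat)
  format("Shared: %d kB, Private: %d kB, GC count / minor / major: %d/%d/%d",
         shared_kb, private_kb,
         gc_stat[:count], gc_stat[:minor_gc_count], gc_stat[:major_gc_count])
end

# Reproduces the message part of the first log line above:
puts memory_log_line(234276, 3184, count: 21, minor_gc_count: 15, major_gc_count: 6)
```

In the log above, the drop from 234276 kB to 62936 kB shared lines up with the major count going from 6 to 7.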
Matthew Draper
@matthewd
Jan 21 2016 22:40
Is that surprising?
Ruby doesn't know you've forked, or take special care to leave alone any heap pages that may be shared
Joe Rafaniello
@jrafanie
Jan 21 2016 22:41
yes
yes, but on GC intervals?
or is it a coincidence ?
Matthew Draper
@matthewd
Jan 21 2016 22:42
It's not a coincidence
Joe Rafaniello
@jrafanie
Jan 21 2016 22:43
So, how would one know if our code is modifying the shared memory or ruby's GC?
Matthew Draper
@matthewd
Jan 21 2016 22:44
You might be able to set up instrumentation to run immediately before/after a GC
What rate of sharing are you expecting?
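One low-tech approximation of that, as a sketch only: this does not hook the GC itself, it just brackets a block of your own code and reports how many GC runs happened inside it. The helper name `gc_counts_around` is made up for illustration.

```ruby
# Run a block and report how the GC counters moved while it ran.
# This only samples GC.stat before and after; it is not a true GC hook.
def gc_counts_around
  before = GC.stat
  result = yield
  after = GC.stat
  keys = [:count, :minor_gc_count, :major_gc_count]
  delta = keys.map { |key| [key, after[key] - before[key]] }.to_h
  [result, delta]
end

# Usage:
#   _, delta = gc_counts_around { do_some_work }   # do_some_work is a placeholder
#   puts delta.inspect
```

If the shared/private totals are sampled at the same two points, a block with a zero delta that still loses shared memory would point at application code rather than the GC.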
Joe Rafaniello
@jrafanie
Jan 21 2016 22:47
It's not the rate of sharing I'm concerned with... it's the appearance that GC is writing to shared memory causing it to be copied and thus private
Matthew Draper
@matthewd
Jan 21 2016 22:57
Something something SystemTap?
I think I'm alleging that the pages in question are the young-object heap space, and it's therefore unremarkable that the GC is trampling them en masse
To be clear, is this "fork, GC, memory is immediately unshared", or "fork, do stuff, GC, notice memory is unshared"?
Joe Rafaniello
@jrafanie
Jan 21 2016 23:05
yeah, looks like this other post is saying the same thing... make things old objects before forking: http://stackoverflow.com/questions/30353272/garbage-collector-in-ruby-2-2-provokes-unexpected-cow
I gotta head out... I'll look at it later... thanks @matthewd
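A minimal sketch of the "make things old before forking" idea from that post, assuming that a few back-to-back full GCs in the parent are enough to promote surviving objects to the old generation. The helper name and the default of 4 rounds are assumptions, and whether this actually keeps more pages shared in our workers is untested here.

```ruby
# Run several full GCs so that surviving objects age into the old generation
# before the process forks. The round count is a guess, not a tuned value.
def promote_survivors(rounds = 4)
  rounds.times { GC.start(full_mark: true, immediate_sweep: true) }
  GC.stat[:old_objects] # number of objects currently marked old (Ruby 2.2+)
end

# Usage in the parent, just before forking a worker:
#   promote_survivors
#   pid = fork { run_worker }   # run_worker is a placeholder
```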