These are chat archives for ManageIQ/manageiq/performance

16th
Jan 2018
Joe Rafaniello
@jrafanie
Jan 16 2018 00:16
I need to run @NickLaMuro scripts but the num cpu call in a loop was growing on glibc malloc but not jemalloc
Or more high level, MiqWorker.status_update_all in a loop in rails console with a server and workers running in the background
Nick LaMuro
@NickLaMuro
Jan 16 2018 00:31
huh, interesting
Joe Rafaniello
@jrafanie
Jan 16 2018 00:36
That's why I'd love to see if these leaks you're seeing are sustained
Nick LaMuro
@NickLaMuro
Jan 16 2018 00:59
well, I killed most of the scripts I was running from over the weekend, but they were going strong when I did (I have the logs if you want me to PM you the details)
I also have a hotfix in place where I am removing the impact of the require from the MiqSystem.num_cpus running on three appliances at the moment
Joe Rafaniello
@jrafanie
Jan 16 2018 03:36
Ah, if that memorized require hotfix is the only change in effect, that should give us a better idea of it's impact over many hours. That should work.
Nick LaMuro
@NickLaMuro
Jan 16 2018 03:57
yeah. I also have one of the servers running with the 40.times { sync_workers } code in place
Beni Cherniavsky-Paskin
@cben
Jan 16 2018 11:57

load leaks, even on bare ruby (2.3.4p301 (2017-03-30 revision 58214) [x86_64-linux]):

touch empty.rb
ruby -e "1_000.times { 1_000.times { load './empty.rb' }; GC.start; STDOUT.write('.') }"

(run top, VIRT keeps growing)
require itself doesn't leak, even without GC.start:

ruby -e "1_000.times { 1_000.times { require './empty.rb' }; STDOUT.write('.') }"

However, same require loop in rails console leaks like crazy (0.4G -> 2.4G in one minute)!

also with GC.

$ bundle exec rails c
...
[1] pry(main)> 1_000.times { 1_000.times { require './empty.rb' }; GC.start; STDOUT.write('.') }

[require under rails c also feels way slower than bare ruby, but that's probably the load path, bootsnap would help, but that's off topic]

Joe Rafaniello
@jrafanie
Jan 16 2018 13:06
Can you try with newer rubies? Eliminate console and pry: require config/environment in your script and run your one line script with ruby
I'll try it later with malloc vs jemalloc
Does anyone know what objects in ObjectSpace.count_objects parsed code gets stored in, ie RubyVM Instruction sequences
It would also be interesting if the code manifested itself as a string duplicated several times
Beni Cherniavsky-Paskin
@cben
Jan 16 2018 13:24
2.4.1: same results (with pry). no time to play more now, will install 2.5 later.
Joe Rafaniello
@jrafanie
Jan 16 2018 16:00
@cben @NickLaMuro I tested ruby 2.3.6, 2.4.3 and 2.5.0 with and without rails loaded. Something crazy is definitely going on
Note, this is on osx, doing my appliance next

rails

ruby -e "require './config/environment'; puts Process.pid; 1_000.times { 1_000.times { require './empty.rb' }; GC.start; STDOUT.write('.') }"

2.5.0 143 MB (much slower and doesn't leak)
2.4.3 800MB and rising and I killed it (faster than ruby 2.5.0)
2.3.6 800MB and rising and I killed it (faster than ruby 2.5.0)

just ruby

ruby -e "puts Process.pid; 1_000.times { 1_000.times { require './empty.rb' }; GC.start; STDOUT.write('.') }"

2.5.0 4.8 MB much much faster than rails require
2.4.3 4.9 MB much much faster than rails require
2.3.6 5.5 MB much much faster than rails require
Nick LaMuro
@NickLaMuro
Jan 16 2018 16:02
@jrafanie when you say "much faster", are you saying how much faster it takes to execute?
Joe Rafaniello
@jrafanie
Jan 16 2018 16:02
yes
I can time a small number of iterations for comparison
Keenan Brock
@kbrock
Jan 16 2018 16:02
@jrafanie how much memory for ruby -e "require './config/environment'; puts Process.pid ; sleep(20)" ?
Nick LaMuro
@NickLaMuro
Jan 16 2018 16:02
that is interesting, because usually bundler makes require faster, because it removes the slowness that is added by rubygems's require patches
Keenan Brock
@kbrock
Jan 16 2018 16:03
hard to know if the require is increasing those 800MB or if it is rails
Nick LaMuro
@NickLaMuro
Jan 16 2018 16:03
about 120MB of that is probably just our app getting loaded
most of my processes start at 140MB on my machine
Keenan Brock
@kbrock
Jan 16 2018 16:04
@NickLaMuro have you removed the relative require and seen if that helps our memory situation?
comment suggests that it is only needed for eval and irb
Nick LaMuro
@NickLaMuro
Jan 16 2018 16:04
I have done my require patch on an appliance, going to get some recent graphs...
Before
20180114_25131.png
After
20180115_11482.png
But... there is a caveat
This is still the heavily commented out code, so we are not doing a lot other than MiqWorker.status_update_all and MiqServer.sync_workers
so re enabling all of that code might show a different leak
but this one in particular, clearly is better
Joe Rafaniello
@jrafanie
Jan 16 2018 16:09
@NickLaMuro here's a smaller number of iterations with timing and memory usage

ruby2.3.6

11:07:20 ~/Code/manageiq (master) (2.3.6) + time ruby -e "require './config/environment'; puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }"
** Using session_store: ActionDispatch::Session::MemCacheStore
92309
............................................................................................................................................................................................................................................................................................................
ruby -e   31.34s user 2.95s system 97% cpu 35.186 total
225 MB

ruby2.4.3

11:04:17 ~/Code/manageiq (master) (2.4.3) + time ruby -e "require './config/environment'; puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }"
** Using session_store: ActionDispatch::Session::MemCacheStore
92257
............................................................................................................................................................................................................................................................................................................
ruby -e   29.05s user 2.56s system 99% cpu 31.734 total
213 MB

ruby2.5.0

11:05:11 ~/Code/manageiq (master) (2.5.0) + time ruby -e "require './config/environment'; puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }"
** Using session_store: ActionDispatch::Session::MemCacheStore
92279
............................................................................................................................................................................................................................................................................................................
ruby -e   43.60s user 15.68s system 99% cpu 59.491 total
134 MB
ruby 2.3 and 2.4 memory did not stabilize, it was still rising, whereas 2.5.0 reached ~130 MB and stayed in that area
(ruby 2.5.0 is roughly twice as slow but doens't seem to leak)
Nick LaMuro
@NickLaMuro
Jan 16 2018 16:11
yeah, that does seem to make sense
what I was noticing in the mem dumps was basically the same memory, and no real pointers into the user space code
Joe Rafaniello
@jrafanie
Jan 16 2018 16:11
I gotta look at something else right now but I was going to try vanilla rails versions next
Nick LaMuro
@NickLaMuro
Jan 16 2018 16:12
that would be a good test
Joe Rafaniello
@jrafanie
Jan 16 2018 16:12
and see if any rails versions have this problem... to eliminate us or them
If anyone does that before me, let me know so I don't do it. I'll ping the room if I come back and start on that
Nick LaMuro
@NickLaMuro
Jan 16 2018 16:16
I would rather you do it since it would keep consistency with what you did on your machine
I can compare everything again on my machine after, but I am currently looking into where this memory in the require is coming from
Joe Rafaniello
@jrafanie
Jan 16 2018 16:22
ok, cool
will do
Joe Rafaniello
@jrafanie
Jan 16 2018 16:41
@NickLaMuro vanilla rails 5.0.6 application looks completely fine... 47-51 MB, very similar timing
chruby 2.3; bundle > /dev/null;
time ruby -e "require './config/environment'; puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }";

chruby 2.4; bundle > /dev/null;
time ruby -e "require './config/environment'; puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }";

chruby 2.5; bundle > /dev/null;
time ruby -e "require './config/environment'; puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }";


93679
............................................................................................................................................................................................................................................................................................................
ruby -e   6.63s user 1.11s system 95% cpu 8.083 total
93692
............................................................................................................................................................................................................................................................................................................
ruby -e   6.06s user 1.04s system 96% cpu 7.347 total
93705
............................................................................................................................................................................................................................................................................................................
ruby -e   6.69s user 1.07s system 96% cpu 8.067 total
Although pure ruby crushes it (4-5 MB)
chruby 2.3; bundle > /dev/null;
time ruby -e "puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }";

chruby 2.4; bundle > /dev/null;
time ruby -e "puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }";

chruby 2.5; bundle > /dev/null;
time ruby -e "puts Process.pid; 300.times { 300.times { require './empty.rb' }; GC.start; STDOUT.write('.') }";


93732
............................................................................................................................................................................................................................................................................................................
ruby -e   1.22s user 0.68s system 99% cpu 1.905 total
93735
............................................................................................................................................................................................................................................................................................................
ruby -e   1.09s user 0.66s system 99% cpu 1.764 total
93738
............................................................................................................................................................................................................................................................................................................
ruby -e   0.91s user 0.62s system 99% cpu 1.536 total
Time to bisect $LOADED_FEATURES in our app
Jason Frey
@Fryguy
Jan 16 2018 18:52
@jrafanie are you doing the require-linux-admin-in-a-loop test on those scenarios?
Joe Rafaniello
@jrafanie
Jan 16 2018 18:53
No, you can see what's in the ruby -e, just requiring an empty rb file
Nick LaMuro
@NickLaMuro
Jan 16 2018 19:06

@jrafanie can I have you test something. can you change the ruby in the command to this:

require './config/environment'; puts Process.pid; f = './empty.rb'; 300.times { 300.times { require f }; GC.start; STDOUT.write('.') }"

And let me know if you still see the leak in our setup? (SPOILER: You shouldn't)

Jason Frey
@Fryguy
Jan 16 2018 19:08
I would like to try vanilla rails + the require linux_admin loop
Joe Rafaniello
@jrafanie
Jan 16 2018 19:10
@NickLaMuro it grows to 208 MB by the time it finishes, which is very similar to what we had before :point_up: January 16, 2018 11:09 AM
Nick LaMuro
@NickLaMuro
Jan 16 2018 19:11
huh... weird..
I did not have those results... hrm....
well... writing some python at the moment... so I will try myself when I am done with that, but I trust your word for now
Joe Rafaniello
@jrafanie
Jan 16 2018 19:13
I did it on a 2.3.1 appliance too, reaches 220 MB
this is what I ran and watched top:
ruby -e "require './config/environment'; puts Process.pid; f = './empty.rb'; 300.times { 300.times { require f }; GC.start; STDOUT.write('.') }"
Joe Rafaniello
@jrafanie
Jan 16 2018 19:19

I would like to try vanilla rails + the require linux_admin loop

Seems fine with vanilla rails:

02:16:11 ~/Code/playground/vanilla_506 (2.3.6) + time ruby -e "require './config/environment'; puts Process.pid; f = 'linux_admin'; 300.times { 300.times { require f }; GC.start; STDOUT.write('.') }"
2717
............................................................................................................................................................................................................................................................................................................
ruby -e   5.29s user 0.35s system 98% cpu 5.750 total
~47 MB
@Fryguy ^
Jason Frey
@Fryguy
Jan 16 2018 19:19
that;s not exactly the same test, but ok
I meant literally run nick's .rb file using vanilla rails
Joe Rafaniello
@jrafanie
Jan 16 2018 19:21
you didn' tsay that duh :-)
which script?
Jason Frey
@Fryguy
Jan 16 2018 19:21
"the require linux_admin loop" is my name for nick's script haha
Joe Rafaniello
@jrafanie
Jan 16 2018 19:22
let's keep the shiny objects to the minimum... be explicit ;-)
Jason Frey
@Fryguy
Jan 16 2018 19:22
couldn't you tell? :trollface:
I have a million tabs open and I can't find it ༼ ༎ຶ ෴ ༎ຶ༽
Nick LaMuro
@NickLaMuro
Jan 16 2018 19:24
one sec
Jason Frey
@Fryguy
Jan 16 2018 19:24
thank you nick
did you fix it for GC'ing? (it wasn't actually hitting the GC lines for me)
Nick LaMuro
@NickLaMuro
Jan 16 2018 19:25
yeah, that was what I was talking about adding the do_gc = nil prior to starting the loop
seemed to not like having that undefined prior to the loop running, but didn't seem to bark or error either
I am sure there is a ruby explanation for it, but for now, I just fixed it and moved on
Thanks for catching that being an issue though
Joe Rafaniello
@jrafanie
Jan 16 2018 19:36
@Fryguy so far, "nick's script" is stable at 53.6 MB and still running. I had to modify it to be run in vanilla rails (comment out gems pending and miq-process)
head 17_test_require_linux_admin.rb
require "./config/environment" unless ENV["WITHOUT_MIQ_ENV"]
#require "manageiq-gems-pending"
#require "util/miq-process"
require "active_support/all"
(ruby 2.3.6 by the way)
the same exact script copied to manageiq with ruby 2.3.6 rises to 141.1 MB and is now stable
Joe Rafaniello
@jrafanie
Jan 16 2018 21:39
@NickLaMuro I have something interesting... I tried using ruby-prof to compare "vanilla rails + require './empty'" to our rails doing the same...

vanilla rails + require './empty'

Measure Mode: wall_time
Thread ID: 7942460
Fiber ID: 19331180
Total: 11.667656
Sort by: self_time

 %self      total      self      wait     child     calls  name
 83.39      9.729     9.729     0.000     0.000      300   <Module::GC>#start
 12.95      1.511     1.511     0.000     0.000    90000   Kernel#require
  1.58      1.787     0.184     0.000     1.602    90000   ActiveSupport::Dependencies::Loadable#load_dependency
  0.69      1.867     0.081     0.000     1.787    90000   ActiveSupport::Dependencies::Loadable#require
  0.46      0.092     0.053     0.000     0.038    90000   ActiveSupport::Dependencies#load?
  0.33      0.038     0.038     0.000     0.000    90000   <Module::ActiveSupport::Dependencies>#mechanism
  0.02     11.668     0.002     0.000    11.665      301  *Integer#times
  0.00     11.668     0.000     0.000    11.668        1   [global]#[no method]
  0.00      0.000     0.000     0.000     0.000        2   IO#set_encoding
Now, for our rails...

Our rails + require './empty'

Measure Mode: wall_time
Thread ID: 18313520
Fiber ID: 19646240
Total: 33.885063
Sort by: self_time

 %self      total      self      wait     child     calls  name
 79.87     27.065    27.065     0.000     0.000      300   <Module::GC>#start
 14.87      6.179     5.040     0.000     1.139    90000   Kernel#require
  1.74      1.139     0.588     0.000     0.551  1080000   Pathname#to_path
  0.98      0.551     0.330     0.000     0.221  1080000   Kernel#initialize_dup
  0.75      6.585     0.253     0.000     6.332    90000   ActiveSupport::Dependencies::Loadable#load_dependency
  0.65      0.221     0.221     0.000     0.000  1080000   String#initialize_copy
  0.42      6.727     0.142     0.000     6.585    90000   ActiveSupport::Dependencies::Loadable#require
  0.24      0.153     0.082     0.000     0.071    90000   ActiveSupport::Dependencies#load?
  0.21      0.071     0.071     0.000     0.000    90000   <Module::ActiveSupport::Dependencies>#mechanism
  0.01     33.885     0.003     0.000    33.882      301  *Integer#times
  0.00     33.885     0.000     0.000    33.885        1   [global]#[no method]
  0.00      0.000     0.000     0.000     0.000        2   IO#set_encoding
We're calling Pathname#to_path 1 million+ times within require
Note, this is a linux appliance I ran this on
Joe Rafaniello
@jrafanie
Jan 16 2018 21:44
Do we have a monkey patch or library that tries to convert require "strings" to pathname objects?
Nick LaMuro
@NickLaMuro
Jan 16 2018 21:59

sorry, lunch

Wow, interesting! Not sure about the monkey patch, but let me trying something...

Keenan Brock
@kbrock
Jan 16 2018 22:02
I tried Kernel.methods(:require).source_location - but didn't show too much
was expecting it to reveal where it was defined
I was probably doing it wrong
Joe Rafaniello
@jrafanie
Jan 16 2018 22:03
FYI, 1080000 is 90,000 * 12, so it appears we're doing to_path 12 times per require
Nick LaMuro
@NickLaMuro
Jan 16 2018 22:06
yeah, I figured as much
same with a String#initialize_copy (also Kernel#initialize_dup I guess too)
Keenan Brock
@kbrock
Jan 16 2018 22:07
I remember seeing the to_s in the code nick shared - isn't that a path to a string?
Joe Rafaniello
@jrafanie
Jan 16 2018 22:12
But why is the path to require a Pathname with our rails vs vanilla rails
that appears to be the difference
Nick LaMuro
@NickLaMuro
Jan 16 2018 22:13
That should never get called
bundler undoes all of that, pretty sure
Joe Rafaniello
@jrafanie
Jan 16 2018 22:14
lolz
Nick LaMuro
@NickLaMuro
Jan 16 2018 22:20
the @chrisarcand gif collection strikes again! :zap:
Jason Frey
@Fryguy
Jan 16 2018 22:21
We do put Pathname's into the eager load paths
(sorry if this was said already)
Joe Rafaniello
@jrafanie
Jan 16 2018 22:22
Interesting...
and I think there are a few in some of the gems like ui-classic
Chris Arcand
@chrisarcand
Jan 16 2018 22:24
@jrafanie That's standard Rails; all the root stuff is Pathname, and you can basically give anything that implements to_path to require
Not to mention that both @Fryguy and myself pressure people to use Pathname whenever possible :smirk:
Joe Rafaniello
@jrafanie
Jan 16 2018 22:25
yeah, I do too... :laughing:
Chris Arcand
@chrisarcand
Jan 16 2018 22:25
Prolly my fave standard lib. Stellar API.
Jason Frey
@Fryguy
Jan 16 2018 22:26
I bet if we count them there are probably 12 paths @jrafanie :moneybag:
Joe Rafaniello
@jrafanie
Jan 16 2018 22:26
irb(main):003:0> Rails.application.config.autoload_paths.length
=> 11
Jason Frey
@Fryguy
Jan 16 2018 22:26
soooo close
Joe Rafaniello
@jrafanie
Jan 16 2018 22:26
I left off the last path for savings ;-)
Jason Frey
@Fryguy
Jan 16 2018 22:26
HAHA
Joe Rafaniello
@jrafanie
Jan 16 2018 22:27
I want to make the vanilla rails slow again (like us)
and leaky
Joe Rafaniello
@jrafanie
Jan 16 2018 22:35
nice @Fryguy... closer
Jason Frey
@Fryguy
Jan 16 2018 22:35
?
Joe Rafaniello
@jrafanie
Jan 16 2018 22:36
on vanilla rails app with 11 autoload Pathames
Measure Mode: wall_time
Thread ID: 18018620
Fiber ID: 22283900
Total: 14.033344
Sort by: self_time

 %self      total      self      wait     child     calls  name
 67.27      9.441     9.441     0.000     0.000      300   <Module::GC>#start
 23.42      4.127     3.287     0.000     0.839    90000   Kernel#require
  2.97      0.839     0.416     0.000     0.423   900000   Pathname#to_path
  1.86      0.423     0.261     0.000     0.162   900000   Kernel#initialize_dup
  1.42      4.429     0.200     0.000     4.229    90000   ActiveSupport::Dependencies::Loadable#load_dependency
  1.16      0.162     0.162     0.000     0.000   900000   String#initialize_copy
  0.58      4.510     0.081     0.000     4.429    90000   ActiveSupport::Dependencies::Loadable#require
  0.41      0.103     0.058     0.000     0.045    90000   ActiveSupport::Dependencies#load?
  0.32      0.045     0.045     0.000     0.000    90000   <Module::ActiveSupport::Dependencies>#mechanism
  0.02     14.033     0.002     0.000    14.031      301  *Integer#times
  0.00     14.033     0.000     0.000    14.033        1   [global]#[no method]
  0.00      0.000     0.000     0.000     0.000        2   IO#set_encoding
The numbers don't match but at least we have pathnames
Jason Frey
@Fryguy
Jan 16 2018 22:36
yup
Nick LaMuro
@NickLaMuro
Jan 16 2018 22:39
@jrafanie when are you triggering ruby prof?
Jason Frey
@Fryguy
Jan 16 2018 22:40
@jrafanie try 12 Pathnames ;)
Joe Rafaniello
@jrafanie
Jan 16 2018 22:40
@NickLaMuro
[root@localhost test_zomg]# cat test.rb
require 'ruby-prof'
require './config/environment'
puts Process.pid
f = './empty'
RubyProf.start
300.times { 300.times { require f }; GC.start }
result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)
Oh well, I'm done for now, I'll probably look again a bit later
Jason Frey
@Fryguy
Jan 16 2018 22:42
:microphone: :droplet: