These are chat archives for ManageIQ/manageiq/performance

2nd
Oct 2015
Matthew Draper
@matthewd
Oct 02 2015 14:42
If memory is actively a concern, have we eliminated the possibility that the STI loader is at fault?
Joe Rafaniello
@jrafanie
Oct 02 2015 14:53
@matthewd it’s concerning, but it doesn’t seem to affect the refresh code compared to the sheer number of objects that get created + less frequent full GC + ruby not giving back memory to the OS
Alex Krzos
@akrzos
Oct 02 2015 14:54
@jrafanie So should I re-test memory with the bulk connect changes for refresh?
Matthew Draper
@matthewd
Oct 02 2015 14:54
Okay.. as long as it's a single-process memory issue, and not that the sum has grown because the average has gone up
Joe Rafaniello
@jrafanie
Oct 02 2015 14:55
If @akrzos can confirm doing export RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9 then starting his rails console test, it should tell us if the 2.2 GC changes are causing the higher memory for ruby
Matthew Draper
@matthewd
Oct 02 2015 14:55
Because I'm really pretty sure the loader currently loads a lot of stuff the first time it's triggered
Joe Rafaniello
@jrafanie
Oct 02 2015 14:56
@matthewd what is the best way to measure the worst case of the sti loader? kill the cached yaml file and try to constantize a bunch of class names?
Matthew Draper
@matthewd
Oct 02 2015 14:57
I'd start by adding a log message when it loads something
Alex Krzos
@akrzos
Oct 02 2015 14:57
@jrafanie I'll re-run with that GC tuning, did you want the objspace dump again with it or just rss and virt memory usage?
Matthew Draper
@matthewd
Oct 02 2015 14:57
Then probably just check what gets loaded during rails c (hopefully nothing), and what gets loaded when you first mention a single class
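A minimal sketch of the "log a message when something loads" idea, assuming a plain Ruby process. It does not touch the ManageIQ STI loader; it uses Ruby's `TracePoint` `:class` event to record every `class`/`module` body that is evaluated while a block runs. The helper name `trace_class_loads` and the demo class are hypothetical.

```ruby
# Record the name of every class or module body evaluated while the block runs.
# Illustrative only: a real investigation would log from inside the loader itself.
def trace_class_loads
  loaded = []
  trace = TracePoint.new(:class) do |tp|
    # Singleton classes have no name, so fall back to inspect.
    loaded << (tp.self.name || tp.self.inspect)
  end
  trace.enable
  yield
  loaded
ensure
  trace.disable if trace
end

names = trace_class_loads do
  # Stand-in for "mention a single class for the first time".
  Object.class_eval("class TraceDemoWidget; end")
end
puts "loaded during block: #{names.join(', ')}"
```

Wrapping the first reference to a single model in a block like this would show how many other classes get pulled in as a side effect.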
Joe Rafaniello
@jrafanie
Oct 02 2015 14:57
I think rss and virt should be enough for that @akrzos
Keep in mind, we’re measuring the effect of a single refresh on the ruby process’s memory
Matthew Draper
@matthewd
Oct 02 2015 14:58
The cached stuff shouldn't matter in the real world, because the cache will never go away on an appliance
Alex Krzos
@akrzos
Oct 02 2015 14:58
so just found an issue for vim broker with my tests that didn't occur in the 5.4 tests
Matthew Draper
@matthewd
Oct 02 2015 14:58
(and will never get out of date, because the source files don't change)
Alex Krzos
@akrzos
Oct 02 2015 14:58
setting cache_scope_ems_refresh occurs in a different file
Joe Rafaniello
@jrafanie
Oct 02 2015 14:59
It would be useful to measure a 1 hour, 2 hour, 3 hour ems refresh worker’s high/low/avg memory usage 5.4 to 5.5
Alex Krzos
@akrzos
Oct 02 2015 14:59
between 5.3/5.4 to 5.5
Alex Krzos
@akrzos
Oct 02 2015 15:00
so my benchmark might be incorrect, however clearly there is still a memory issue with refresh since vim broker is not in use with rhevm
actually I think memory usage might be higher
after I fix this since it's doing a refresh against the vim broker with cache_scope_core rather than cache_scope_ems_refresh
Matthew Draper
@matthewd
Oct 02 2015 15:01

Traditionally people may approach a performance regression with the mindset of:

Something made this slower, let's work hard to find out what

Instead I prefer to take the approach of

What can I do to make this slow piece of code faster

I think this is the idea I was dancing around yesterday

Joe Rafaniello
@jrafanie
Oct 02 2015 15:02
@akrzos I think we just have to tweak the GC to do it more often than now, but still less than it did with 2.0
we did find a few new gems that add some overhead to just loading the rails app but it wasn’t huge, maybe 20 MB or so
Alex Krzos
@akrzos
Oct 02 2015 15:06
Alright let me setup a 5.5 appliance to monitor memory growth on the refresh worker
and run the gc tuning experiment
@jrafanie for the ems refresh worker, can you point me to the line of code i should comment out to prevent any new EmsRefreshes from being scheduled so we can let this cook over the weekend?
I'll also set up his restart period to 0 so he should never be restarted
Joe Rafaniello
@jrafanie
Oct 02 2015 15:09

for the ems refresh worker, can you point me to the line of code i should comment out to prevent any new EmsRefreshes from being scheduled so we can let this cook over the weekend?

@akrzos I don’t know what that means… I was thinking we can let the current environment continually refresh based on normal activity… with 2 appliances setup (5.4 and 5.5/upstream) on the same setup, we can get an idea how the memory usage of ruby changes over longer periods of time

Alex Krzos
@akrzos
Oct 02 2015 15:10
ok
@jrafanie I can set it up that way too, so a 5.4 and 5.5 appliance connected to same small vmware environment, there will be an EmsRefresh scheduled every 24 hours then I believe by default
Joe Rafaniello
@jrafanie
Oct 02 2015 15:11
Does that make sense @matthewd @akrzos @dmetzger57 ? I don’t want to ask for something that’s not useful
Jason Frey
@Fryguy
Oct 02 2015 15:11
what are we trying to solve?
Joe Rafaniello
@jrafanie
Oct 02 2015 15:12
@Fryguy measuring memory over time as opposed to just based on a single refresh
Matthew Draper
@matthewd
Oct 02 2015 15:12
@jrafanie I don't really understand why we seem focused on measurement of 5.4
Joe Rafaniello
@jrafanie
Oct 02 2015 15:13
@matthewd it’s the baseline for measuring a memory regression/behavior change I suppose?
Matthew Draper
@matthewd
Oct 02 2015 15:13
I'm sure there are plenty of things using more (or less) memory thanks to the Ruby / Rails / other-random-gem updates… but don't we "just" need to measure what's happening now, and make it not happen?
We must already have a baseline, because we've had a report of a problem, no?
i.e., we already have a goal-post
Joe Rafaniello
@jrafanie
Oct 02 2015 15:16
Maybe we do... I don’t know what an ems refresh worker consumes in terms of memory in @akrzos’ environment on average on 5.4
Alex Krzos
@akrzos
Oct 02 2015 15:17
@matthewd The issue with that is a baseline can mean many things, how many providers, how many workers, etc I think if we have a 5.4 environment running with the 5.5 we can see the difference between the two versions here. My current baseline merely measures the difference of memory usage during a single EmsRefresh
The memory usage over time would help us understand if this memory usage issue extends to something bigger than just a single ems refresh
Matthew Draper
@matthewd
Oct 02 2015 15:19
I think I'm failing to see why we care
If it uses too much memory, we need to make it use less memory
Jason Frey
@Fryguy
Oct 02 2015 15:19
(me too)
I'm more wondering if it's actually a problem at all
Yes, memory increased from 5.4 to 5.5
but we also upgraded to Ruby 2.2 which is known to use more memory
(in certain situations)
Matthew Draper
@matthewd
Oct 02 2015 15:20
Knowing whether it persists on the current code, tells us important diagnostic stuff to help us work out exactly where it's being retained… knowing detail about the previous code tells us.. what?
Jason Frey
@Fryguy
Oct 02 2015 15:20
so if it's really a problem at all, we just make it use less memory
Matthew Draper
@matthewd
Oct 02 2015 15:20
Oh, yeah, I was working on the basis that someone has specifically reported "this now uses too much memory, and is exceeding some threshold", not "hey, I noticed this is using more memory"
Oleg Barenboim
@chessbyte
Oct 02 2015 15:21
@matthewd Performance QE for CloudForms (ie @akrzos) reported his finding that we are using double the memory in 5.5 as we did in 5.4
Matthew Draper
@matthewd
Oct 02 2015 15:22
In total? Or in a refresh worker?
Alex Krzos
@akrzos
Oct 02 2015 15:22
@matthewd I understand what you're saying but there is no requirement given that says a refresh worker must never use more than 1GiB of memory
The full scope of the problem isn't entirely understood yet
Jason Frey
@Fryguy
Oct 02 2015 15:23
@akrzos Did you ever run the refresh wrapped in GC.starts?
Alex Krzos
@akrzos
Oct 02 2015 15:23
It's demonstrated easily with refresh
Jason Frey
@Fryguy
Oct 02 2015 15:23
I'm curious if the memory will just :sparkles: go away :sparkles:
Alex Krzos
@akrzos
Oct 02 2015 15:23
@Fryguy not yet I can get that done now though
@Fryguy ok test is running right now
Matthew Draper
@matthewd
Oct 02 2015 15:26
(I should note that all my comments need to be read with an implicit "this isn't my area of responsibility" disclaimer — if Team Perf want to measure a thing, then so be it)
But as @Fryguy said, ruby 2.2 is known to use more memory, as a trade-off for better/more consistent performance
Alex Krzos
@akrzos
Oct 02 2015 15:27
@Fryguy FYI:
mrss_start = MiqProcess.processInfo()[:memory_usage]
gc_start = GC.count
e = ExtManagementSystem.find_by_name('rhemv-small')
GC.start
timing = Benchmark.realtime {EmsRefresh.refresh e}
GC.start
mrss_end = MiqProcess.processInfo()[:memory_usage]
gc_end = GC.count
mrss_change = mrss_end - mrss_start
gc_change = gc_end - gc_start
puts "#{mrss_start}, #{mrss_end}, #{mrss_change}"
puts "#{gc_start}, #{gc_end}, #{gc_change}"
puts Process.pid
timing
Matthew Draper
@matthewd
Oct 02 2015 15:29
So if we're going to treat "higher memory usage" as a problem, which we seem to be doing, then we're going to need to know what our goal is… and then we're going to need to work out what's allocating the most in 5.5
Because the same allocations in 5.4 would have been less costly
Alex Krzos
@akrzos
Oct 02 2015 15:29
@matthewd Agree we need a definable goal for it
Matthew Draper
@matthewd
Oct 02 2015 15:30
The regression, if we're treating it as one, is in the change of interpreter, and likely not some single new line of code that's doing allocations we weren't before
Alex Krzos
@akrzos
Oct 02 2015 15:30
Perhaps goal could be measured with the expected size of a provider we can manage with a single appliance
because with higher memory usage the size of the provider we can manage with appliance defaults (6GiB memory) now decreases
Matthew Draper
@matthewd
Oct 02 2015 15:31
Agreed that sounds like a sensible metric
Though if we don't currently have such a metric, I guess I still don't see why a change in memory consumption is more than a curiosity
If we don't know what we're aiming for, how do we know the current change is "bad"?
Oleg Barenboim
@chessbyte
Oct 02 2015 15:32
@matthewd if workers use more memory, then we can probably have fewer workers running concurrently on each appliance -- thus customers would need to run more appliances to get back to old number of workers -- and more appliances have an actual cost on the infrastructure on which they are running
Joe Rafaniello
@jrafanie
Oct 02 2015 15:33
@matthewd I’m going with the let’s measure what we don’t know for comparison… because a single refresh, while interesting is not what appliances are used for
Oleg Barenboim
@chessbyte
Oct 02 2015 15:33
it's one thing if there is a 10% bump -- quite another if we are doubling memory, thus, probably, halving the number of workers
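The capacity argument above is simple arithmetic, sketched here with hypothetical numbers. The 6 GiB figure is the appliance default mentioned earlier in the chat; the per-worker sizes (500 and 1000 MiB) and the 120-worker target are made up for illustration, and the sketch pretends the whole memory budget is available to workers.

```ruby
MIB_PER_GIB = 1024

# How many workers of a given size fit in a memory budget (rounds down).
def workers_that_fit(budget_mib, per_worker_mib)
  budget_mib / per_worker_mib
end

# How many appliances are needed to host a target number of workers (rounds up).
def appliances_needed(total_workers, workers_per_appliance)
  (total_workers.to_f / workers_per_appliance).ceil
end

budget = 6 * MIB_PER_GIB # appliance default from the discussion
[500, 1000].each do |per_worker| # hypothetical per-worker sizes in MiB
  fit = workers_that_fit(budget, per_worker)
  puts "#{per_worker} MiB/worker: #{fit} workers per appliance, " \
       "#{appliances_needed(120, fit)} appliances for 120 workers"
end
```

With these assumed inputs, doubling the per-worker footprint halves the workers per appliance and doubles the appliance count, which is the relationship being described.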
Matthew Draper
@matthewd
Oct 02 2015 15:34
@chessbyte that's only a problem if the workers aren't running twice as fast
(I'm not saying I think they are… but we seem to be measuring something other than what we care about)
Alex Krzos
@akrzos
Oct 02 2015 15:35
@matthewd I have been measuring both sides of the equation though
Matthew Draper
@matthewd
Oct 02 2015 15:35
.. which is useful once we're trying to track down the source of the overall regression, but doesn't seem like something we need to optimize in isolation
Alex Krzos
@akrzos
Oct 02 2015 15:35
timing + memory usage
Joe Rafaniello
@jrafanie
Oct 02 2015 15:35
we care about memory usage because we configure appliances worker counts based on memory usage
Oleg Barenboim
@chessbyte
Oct 02 2015 15:35
customers care about how many appliances they have to stand up for CloudForms to work for their environment
one thing to say 18-20 -- quite another to say 40 with the new version
there is not only a cost to running the appliances, there is also an administrative cost to upgrading them (because our upgrade process sucks)
Joe Rafaniello
@jrafanie
Oct 02 2015 15:36
@matthewd, as I think you’re implying, it would be useful to measure worker speed/throughput in addition to memory
Matthew Draper
@matthewd
Oct 02 2015 15:37
So, is our target to match the memory usage of 5.4?
Joe Rafaniello
@jrafanie
Oct 02 2015 15:37
but only the latter is what users and we as developers have ever configured
Matthew Draper
@matthewd
Oct 02 2015 15:37
Or, actually, botvinnik, because this is a ManageIQ channel
Oleg Barenboim
@chessbyte
Oct 02 2015 15:37
botvinnik vs capablanca is fine
Joe Rafaniello
@jrafanie
Oct 02 2015 15:38
My goal is not botvinnik, ruby 2.2 will never measure the same unless we disable the generational GC
or throw out half our gems
or tirelessly eliminate queries, add selects, etc.
Matthew Draper
@matthewd
Oct 02 2015 15:40
Do we have a goal, then? Or just "less"?
Alex Krzos
@akrzos
Oct 02 2015 15:40
@Fryguy So first run on vmware-small shows about 10MiB less with GC.start wrapping the Refresh
Oleg Barenboim
@chessbyte
Oct 02 2015 15:41
as I wrote botvinnik vs capablanca, I was reminded of their very famous game: https://en.wikipedia.org/wiki/Botvinnik_versus_Capablanca,_AVRO_1938
Alex Krzos
@akrzos
Oct 02 2015 15:41
We need to understand if GC.start on 5.5 is doing a major or minor GC run
Looks like you can pass arguments to make it do a minor GC run and a full sweep afterwards
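A small sketch of how one might check which kind of collection `GC.start` triggers, assuming MRI 2.2 or later. `GC.start` accepts the `full_mark:` and `immediate_sweep:` keywords, and `GC.stat` exposes `:minor_gc_count` and `:major_gc_count`. The helper names are hypothetical.

```ruby
# Snapshot of how many collections have run so far in this process.
def gc_run_counts
  {
    total: GC.count,
    minor: GC.stat(:minor_gc_count),
    major: GC.stat(:major_gc_count)
  }
end

# Run the block and report how many collections of each kind it caused.
def gc_runs_during
  before = gc_run_counts
  yield
  after = gc_run_counts
  after.each_with_object({}) { |(kind, count), delta| delta[kind] = count - before[kind] }
end

default_start = gc_runs_during { GC.start }
minor_start   = gc_runs_during { GC.start(full_mark: false, immediate_sweep: true) }

puts "GC.start:                   #{default_start}"
puts "GC.start(full_mark: false): #{minor_start}"
```

With no arguments `GC.start` requests a full mark, so the major count should go up. With `full_mark: false` the interpreter may still decide a major collection is needed, so the second result is worth inspecting rather than assuming.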
Joe Rafaniello
@jrafanie
Oct 02 2015 15:44
I think it depends on that other value I was sending around, RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR, which I think defaults to 2.0
 * * RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR (new from 2.1.1)
 *   - Do full GC when the number of old objects is more than R * N
 *     where R is this factor and
 *           N is the number of old objects just after last full GC.
from here
Keenan Brock
@kbrock
Oct 02 2015 16:02
when I rails console 5.4 using ruby 2.2 and rails 4.2.4, it does bump a little, but not as much as when I rails console master
Alex Krzos
@akrzos
Oct 02 2015 16:20
@jrafanie FYI using export RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9 didn't change RSS Memory utilization still at 94MiB for VMware-small
Let me snapshot GC stat to be sure it is doing only major GC runs
Joe Rafaniello
@jrafanie
Oct 02 2015 16:23
@akrzos can you "refresh" my memory, what was it on 5.4 and 5.5 as is for the same refresh on vmware-small?
Alex Krzos
@akrzos
Oct 02 2015 16:36
@jrafanie 5.4 - Between 58MiB to 63MiB, 5.5 - Between 93MiB to 94MiB
like it "refresh" :smile:
I did check in the rails console ENV to see RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR set
and it was, not sure if there is a method to view GC tunings in the console to verify it is set
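One possible way to double-check the tuning from inside the console, as a hedged sketch. MRI reads the `RUBY_GC_*` variables once at boot, so listing them from `ENV` only shows what the process was started with. The second helper relies on an assumption about MRI internals: after a full GC, `old_objects_limit` is set to roughly `old_objects` times the factor, so the ratio of those two `GC.stat` values should approximate the factor in effect. Both helper names are hypothetical.

```ruby
# RUBY_GC_* variables visible to this process (what it was started with).
def gc_env_settings(env = ENV)
  env.to_h.select { |name, _value| name.start_with?("RUBY_GC_") }
end

# Assumption: right after a full GC, old_objects_limit is about
# old_objects * RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR, so this ratio
# approximates the factor the interpreter is actually using.
def observed_oldobject_factor
  GC.start
  stat = GC.stat
  stat[:old_objects_limit].to_f / stat[:old_objects]
end

puts gc_env_settings.inspect
puts format("observed old-object factor: %.2f", observed_oldobject_factor)
```

If the ratio printed in a console started with the export is close to 0.9 and noticeably different from a console started without it, the setting is most likely being picked up.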
Alex Krzos
@akrzos
Oct 02 2015 16:42
So ran with GC.stat and all major GC runs
1 minor
Keenan Brock
@kbrock
Oct 02 2015 16:49
@akrzos for me, starting up rails console shows a big span of memory used up. sometimes a difference of 20% from one run to the next. May want to run the same command in a few different rails console invocations to determine if it is consistent for you
I just asked "what is your rss" on boot
Alex Krzos
@akrzos
Oct 02 2015 16:51
@kbrock Yes I do multiple runs on the benchmarks and typically use the 99%ile for timing values, for memory I think it's better to use the Max. When I setup a large run of the benchmarks I have 4 iterations for each benchmark
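The summarizing described above (99th percentile for timings, max for memory) could look like the following sketch. It uses the nearest-rank percentile definition, which is an assumption; the chat does not say which definition the benchmarks use. The sample values are made up.

```ruby
# Nearest-rank percentile: the smallest sample such that at least pct percent
# of all samples are less than or equal to it.
def percentile(values, pct)
  raise ArgumentError, "no samples" if values.empty?

  sorted = values.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

timings_s = [41.2, 39.8, 40.5, 44.1]   # hypothetical refresh timings in seconds
rss_mib   = [93.1, 93.8, 94.0, 93.4]   # hypothetical RSS deltas in MiB

puts "timing p99: #{percentile(timings_s, 99)} s"
puts "memory max: #{rss_mib.max} MiB"
```

Note that with only four iterations the nearest-rank 99th percentile is simply the largest sample, so it coincides with the max.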
Joe Rafaniello
@jrafanie
Oct 02 2015 16:53
@akrzos I had to export the env variable
to check it, launch rails console with and without it; the memory usage of that process should be lower with it set
Alex Krzos
@akrzos
Oct 02 2015 16:54
Also I'm measuring difference of RSS Memory rather than "total" RSS Memory, meaning the change in Memory from before to after refresh. Let me get an approximate rails console RSS Memory usage just for boot
@jrafanie I did that in the terminal before running
Joe Rafaniello
@jrafanie
Oct 02 2015 16:56
my rails console resident goes from ~160 MB to ~130 MB on master after running export RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9 in the current shell
Keenan Brock
@kbrock
Oct 02 2015 16:57
@akrzos are you seeing the memory used up by a rails process vary greatly? before it even runs anything?
Do I need to run GC or something to stabilize it?
Alex Krzos
@akrzos
Oct 02 2015 16:57
@kbrock pulling that out of my log right now, I log the memory usage at start of console
so I can see how that varies for 5.5
doesn't look like much so far
Keenan Brock
@kbrock
Oct 02 2015 16:58
thanks
Alex Krzos
@akrzos
Oct 02 2015 17:01
       bytes        MiB
Max    178466816    170.1992188
Avg    176099828.6  167.9418837
Min    175329280    167.2070313
Stdev  502891.516   0.4795947228
not sure how to format that
180 samples
pulled from my run against 5.5.0.1
so pretty much ~167MiB for console start up
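For reference, a summary like the one above can be produced from raw byte samples with a few lines of Ruby (2.4 or later for `Array#sum`). This sketch assumes the sample standard deviation (dividing by n - 1); the chat does not say which variant was used. The input values are placeholders, not the 180 real samples.

```ruby
BYTES_PER_MIB = 1024.0 * 1024.0

def to_mib(bytes)
  bytes / BYTES_PER_MIB
end

# Max, average, min and sample standard deviation; needs at least two samples.
def summarize(samples)
  mean = samples.sum / samples.size.to_f
  variance = samples.sum { |s| (s - mean)**2 } / (samples.size - 1)
  { max: samples.max, avg: mean, min: samples.min, stdev: Math.sqrt(variance) }
end

rss_bytes = [175_329_280, 176_100_000, 178_466_816] # placeholder samples
summarize(rss_bytes).each do |label, bytes|
  puts format("%-6s %14.1f bytes %12.4f MiB", label, bytes, to_mib(bytes))
end
```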
Joe Rafaniello
@jrafanie
Oct 02 2015 17:05
Good, that's what I'm seeing here too, if you disabled the generational GC via that env variable, it should drop 20-30 MB
Keenan Brock
@kbrock
Oct 02 2015 17:11
mine is varying much more than that. huh
Joe Rafaniello
@jrafanie
Oct 02 2015 17:11
well, it's very variable, but it should drop
Alex Krzos
@akrzos
Oct 02 2015 17:11
how are you loading up the console? I'm using bundle exec bin/rails c
Joe Rafaniello
@jrafanie
Oct 02 2015 17:12
yes
but i'm exporting the env variable into the shell first
Alex Krzos
@akrzos
Oct 02 2015 17:12
let me look back when I had that set
I'll only have one sample but lets see if RSS was lower
Joe Rafaniello
@jrafanie
Oct 02 2015 17:13
to be clear, this is only to determine how much of the increased memory usage is due to the generational GC trading memory for cpu
@matthewd our routes.rb being loaded jumps ~4 MB RES, ~4 MB VIRT on my mac
Alex Krzos
@akrzos
Oct 02 2015 17:16
@jrafanie actually just looking at the gist I sent earlier, I'm not seeing that same drop in memory usage with the export, but we do see the drop in minor GC count to 1
Joe Rafaniello
@jrafanie
Oct 02 2015 17:18
Correction, requiring fast_gettext jumps 14 MB, 13 MB res/virt
@Fryguy is looking into how to delay load that
Jason Frey
@Fryguy
Oct 02 2015 17:19
but @jrafanie can you verify that if you rm the other languages that you don't get as much of a jump in memory
Joe Rafaniello
@jrafanie
Oct 02 2015 17:19
ok, just reviewing the require logging again to see all the jumps
yeah, let me try that now @Fryguy
Matthew Draper
@matthewd
Oct 02 2015 17:20
@jrafanie "jumps" == before load vs after load, or a comparison between two different setups?
(and is this with normal GC, or magic env?)
Joe Rafaniello
@jrafanie
Oct 02 2015 17:20
jumps == unscientific, MiqProcess.processInfo(Process.pid)[:memory_usage] before and after require
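The same before/after measurement can be sketched without `MiqProcess`, assuming a Linux `/proc` filesystem or a `ps` binary that accepts `-o rss= -p`. The helper names are hypothetical, and the result is only a rough indicator: a feature that is already loaded reports no change, and GC activity during the require can skew the number.

```ruby
# Resident set size of a process in KiB, from /proc when available, else from ps.
def current_rss_kb(pid = Process.pid)
  status = "/proc/#{pid}/status"
  if File.readable?(status)
    File.foreach(status) do |line|
      return line.split[1].to_i if line.start_with?("VmRSS:")
    end
  end
  `ps -o rss= -p #{pid}`.to_i
end

# Require a feature and report how much RSS changed while loading it.
def measure_require(feature)
  before = current_rss_kb
  newly_loaded = require(feature)
  { feature: feature, newly_loaded: newly_loaded, rss_delta_kb: current_rss_kb - before }
end

p measure_require("json")
```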
Joe Rafaniello
@jrafanie
Oct 02 2015 17:27
@Fryguy removing the po files and I don't see any memory increase requiring fast_gettext
namely
        deleted:    config/locales/en/manageiq.po
        deleted:    config/locales/ja/manageiq.po
        deleted:    config/locales/nl/manageiq.po
Jason Frey
@Fryguy
Oct 02 2015 17:28
ok cool...i'll investigate the delay loading
probably will need to get a PR into fast_gettext...there's an open issue where people want that feature anyway
Joe Rafaniello
@jrafanie
Oct 02 2015 17:35
@Fryguy, for reference, these are the ones that are reported just before it jumps memory: fast_gettext/vendor/poparser (8 MB)
fast_gettext/translation_repository/base (6 MB)
168 MB RES, 2574 MB VIRT,   1  <+  | /Users/joerafaniello/Code/manageiq/config/initializers/backtrace_silencers.rb  (0.036846)
174 MB RES, 2580 MB VIRT,   3  <+  | | | fast_gettext/translation_repository/base  (0.001942)
174 MB RES, 2580 MB VIRT,   3  <+  | | | fast_gettext/translation_repository/mo  (0.018688)
174 MB RES, 2580 MB VIRT,   2  <+  | | fast_gettext/translation_repository/po  (0.055840)
174 MB RES, 2580 MB VIRT,   2  <+  | | fast_gettext/po_file  (0.018703)
174 MB RES, 2580 MB VIRT,   2  <+  | | fast_gettext/vendor/poparser  (0.021935)
182 MB RES, 2587 MB VIRT,   1  <+  | /Users/joerafaniello/Code/manageiq/config/initializers/fast_gettext.rb  (0.827673)
those must be eagerly loading/parsing the translations
Jason Frey
@Fryguy
Oct 02 2015 17:37
oh I already know where it happens
it's pretty clear in the fast_gettext
it loads all po files and converts them to mo files
and it reads them by using a racc based parser
should be easy to introduce a delayer object
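A minimal sketch of what such a "delayer object" might look like, assuming nothing about fast_gettext's real API. `LazyTranslations` is a hypothetical wrapper that postpones an expensive load (here a stand-in for parsing a .po file) until the first lookup, then caches the result.

```ruby
# Defers an expensive load until the first lookup, then reuses the result.
class LazyTranslations
  def initialize(&loader)
    @loader = loader
    @table = nil
  end

  # True once the underlying data has actually been loaded.
  def loaded?
    !@table.nil?
  end

  def [](key)
    table[key]
  end

  private

  def table
    @table ||= @loader.call
  end
end

german = LazyTranslations.new do
  puts "parsing translations..." # stands in for reading and parsing a .po file
  { "Hello" => "Hallo" }
end

puts german.loaded?   # nothing parsed yet
puts german["Hello"]  # first lookup triggers the load
puts german.loaded?
```

The trade-off is that the memory and parse cost moves to the first request in each locale instead of disappearing, so processes that never translate anything benefit the most.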
Joe Rafaniello
@jrafanie
Oct 02 2015 17:38
ah, there's still a 6 MB jump for base after removing the po files
167 MB RES, 2582 MB VIRT,   1  <+  | /Users/joerafaniello/Code/manageiq/config/initializers/backtrace_silencers.rb  (0.038011)
173 MB RES, 2588 MB VIRT,   3  <+  | | | fast_gettext/translation_repository/base  (0.001829)
173 MB RES, 2588 MB VIRT,   3  <+  | | | fast_gettext/translation_repository/mo  (0.018523)
173 MB RES, 2588 MB VIRT,   2  <+  | | fast_gettext/translation_repository/po  (0.056811)
173 MB RES, 2588 MB VIRT,   2  <+  | | fast_gettext/po_file  (0.018753)
173 MB RES, 2588 MB VIRT,   1  <+  | /Users/joerafaniello/Code/manageiq/config/initializers/fast_gettext.rb  (0.328553)
Jason Frey
@Fryguy
Oct 02 2015 17:47
I'd expect some kind of jump
Joe Rafaniello
@jrafanie
Oct 02 2015 17:48
@Fryguy to be clear, fast_gettext require is fine, it's the config/initializers/fast_gettext.rb require that's calling into fast_gettext to read the po files and loading base
Jason Frey
@Fryguy
Oct 02 2015 17:49
yeah, but it has to if we want i18n to work at all :P
thanks for the info
Matthew Draper
@matthewd
Oct 02 2015 17:57
I thought we were explicitly loading all the files, because they each contain the translation of their own name
Joe Rafaniello
@jrafanie
Oct 02 2015 18:01
any idea if we can require just the parts of fog that we need? require 'fog' in irb goes from 15 MB to 47.5 MB
Jason Frey
@Fryguy
Oct 02 2015 18:02
well, if so, that stinks...but I can see all of them required by the gem up front
I can verify what you're saying @matthewd
@jrafanie Not yet...it's not been fully separated where you can do that
right now fog requires all the pluggable bits
in the future they plan to flip that and have the individual gems require fog-core
Oleg Barenboim
@chessbyte
Oct 02 2015 18:06
same thing we need to do for our providers ;)
Joe Rafaniello
@jrafanie
Oct 02 2015 18:06
Nice one @chessbyte, very true
Keenan Brock
@kbrock
Oct 02 2015 18:08
@Fryguy for gettext can we convert a po object to a mo object at build time, so we don't have to do that conversion every time?
Alex Krzos
@akrzos
Oct 02 2015 19:18
@kbrock so for kicks I ran this: for i in $(seq 1 20); do bundle exec bin/rails r 'puts MiqProcess.processInfo()[:memory_usage]' | tee -a console_memory.out; done on 5.5 and 5.4 for quick comparison
Keenan Brock
@kbrock
Oct 02 2015 19:32
hmm
I was looking at ObjectSpace
Alex Krzos
@akrzos
Oct 02 2015 20:02
so definitely more stuff loaded by the console, although my measurements on RSS were difference in the BZ and didn't include the console itself
Keenan Brock
@kbrock
Oct 02 2015 20:06
good one
Alex Krzos
@akrzos
Oct 02 2015 21:58
So just noticed this one
Automate workers even though automate role not on
is that bzed yet?
not sure if it is a bz
That right there is another 300MiB+ RSS Memory on my appliance
fyi
@jrafanie Also setup a simple script to track rss and vsz on 5.4 vs 5.5 attached to a medium sized vmware provider for the weekend
will be fun to see what it looks like monday morning
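The tracking script itself is not shown in the chat. A sketch of the general idea, assuming Linux `/proc` or a `ps` that accepts `-o rss= -o vsz= -p`, could look like this. The function names, the CSV layout, and the sampling parameters are all hypothetical.

```ruby
require "time"

# [rss_kb, vsz_kb] for a process, from /proc when available, else from ps.
def sample_memory_kb(pid)
  status = "/proc/#{pid}/status"
  if File.readable?(status)
    fields = File.readlines(status).map(&:split)
    rss = fields.find { |f| f[0] == "VmRSS:" }
    vsz = fields.find { |f| f[0] == "VmSize:" }
    return [rss[1].to_i, vsz[1].to_i] if rss && vsz
  end
  `ps -o rss= -o vsz= -p #{pid}`.split.map(&:to_i)
end

# Append one CSV row per sample to anything that responds to <<.
def track_memory(pid, samples:, interval:, out: $stdout)
  out << "time,rss_kb,vsz_kb\n"
  samples.times do |i|
    rss, vsz = sample_memory_kb(pid)
    out << "#{Time.now.utc.iso8601},#{rss},#{vsz}\n"
    sleep interval if i < samples - 1
  end
end

# Demo against this process; a real run would target the worker's pid
# with a much longer interval and a file opened in append mode.
track_memory(Process.pid, samples: 3, interval: 0.1)
```

Running one copy per appliance against the refresh worker's pid would give comparable high/low/average series for the two versions.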
Joe Rafaniello
@jrafanie
Oct 02 2015 22:04
@akrzos I vaguely recall that about the automate worker, not sure why it needs to start if the role isn't on though
@gmcculloug might remember
Great, @akrzos re: the memory tracker, we'll see