Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Repo info
Activity
    Jason Frey
    @Fryguy
    so it's the same, or not? (You wrote "by contrast" which implies there's a difference)
    Nick LaMuro
    @NickLaMuro
    I will have to check with the logs directly, but maybe "by contrast" was a poor choice of words
    basically, all I can say is the only for sure difference between the two is the amount of time run, and the changes to the codebase
    the first graph is basically we are in a state where the rest of the workers are doing nothing, since we aren't passing messages over DRB
    yet we still seem to be leaking
    I also am unsure about the spike at the beginning of the first graph
    Jason Frey
    @Fryguy
    it's a good approach though...do you think you can comment it line-by-line?
    Joe Rafaniello
    @jrafanie
    yeah, it's worth eliminating as many things as possible
    Nick LaMuro
    @NickLaMuro

    do you think you can comment it line-by-line?

    I think so, but I have some ideas at what it might be just from this test:

    1. What I introduced with the initializer: gist
    2. One of the other methods in in MiqServer#monitor
    3. Possibly Benchmark.realtime_block (would explain why other workers are also leaking slowly)?
    I am leaning towards either removing the initializer (to see if the heisenberg principle is at play), or whittling things down (number 2)
    Jason Frey
    @Fryguy
    Removing Benchmarking should be simple
    Nick LaMuro
    @NickLaMuro
    3 is there just because it was something that would also explain leaks in other workers, and it is the only thing that I noticed that might be related in both (probably going to wait on messing with this, just putting it out there as food for thought)
    @Fryguy yeah, easy, just whatever we choose, we have to wait about 12-24 hours to see the affects
    Jason Frey
    @Fryguy
    ahhhh
    Nick LaMuro
    @NickLaMuro
    though, if @jrafanie is satisfied with the logs that we have currently that the heap isn't growing in ruby on that appliance, removing 1 might be the best first step
    those graphs are generated from top_output, so we already have something that can measure the affects of RSS/VIRT/etc.
    Jason Frey
    @Fryguy
    ah right....removing 3 would imply a Ruby leak, which we don't think it is
    Nick LaMuro
    @NickLaMuro
    yeah, unsure where it is, but if we get down to heartbeat being the last method not commented, then that might be a possible culprit
    Jason Frey
    @Fryguy
    comment out the entirety of monitor_poll ;)
    Nick LaMuro
    @NickLaMuro
    heh
    you mean #monitor right?
    Jason Frey
    @Fryguy
    yeah
    Joe Rafaniello
    @jrafanie
    For more data points...
    Nick LaMuro
    @NickLaMuro
    #monitor_poll would make it go quicker though, but that is probably not another variable we want to deal with
    Joe Rafaniello
    @jrafanie

    So, I'm not sure what's different but @NickLaMuro's appliance seems to be leaking a little faster than @dmetzger57's...

    For Dennis's appliance:

    [----] I, [2017-12-06T03:27:11.137418 #5248:b9d138]  INFO -- : MIQ(MiqServer#monitor_poll) Memory Info XXXX => {RSS: 368912000, PSS: 289559000, thread_count: 24, gc_stats: {:count=>2267, :heap_allocated_pages=>4284, :heap_sorted_length=>4285, :heap_allocatable_pages=>0, :heap_available_slots=>1746159, :heap_live_slots=>1745436, :heap_free_slots=>723, :heap_final_slots=>0, :heap_marked_slots=>1194162, :heap_swept_slots=>81064, :heap_eden_pages=>4284, :heap_tomb_pages=>0, :total_allocated_pages=>4284, :total_freed_pages=>0, :total_allocated_objects=>1291070308, :total_freed_objects=>1289324872, :malloc_increase_bytes=>838968, :malloc_increase_bytes_limit=>16777216, :minor_gc_count=>2219, :major_gc_count=>48, :remembered_wb_unprotected_objects=>60850, :remembered_wb_unprotected_objects_limit=>114182, :old_objects=>1090297, :old_objects_limit=>2047722, :oldmalloc_increase_bytes=>4565680, :oldmalloc_increase_bytes_limit=>16777216}, objspace_count: {:TOTAL=>1746159, :FREE=>670, :T_OBJECT=>197732, :T_CLASS=>18495, :T_MODULE=>2530, :T_FLOAT=>12, :T_STRING=>610593, :T_REGEXP=>2363, :T_ARRAY=>274330, :T_HASH=>78847, :T_STRUCT=>14544, :T_BIGNUM=>363, :T_FILE=>205, :T_DATA=>110419, :T_MATCH=>43827, :T_COMPLEX=>1, :T_RATIONAL=>3904, :T_SYMBOL=>4626, :T_IMEMO=>376098, :T_ICLASS=>6600}}
    [----] I, [2017-12-06T11:21:57.119583 #5248:b9d138]  INFO -- : MIQ(MiqServer#monitor_poll) Memory Info XXXX => {RSS: 414292000, PSS: 334940000, thread_count: 24, gc_stats: {:count=>3379, :heap_allocated_pages=>4284, :heap_sorted_length=>4285, :heap_allocatable_pages=>0, :heap_available_slots=>1746159, :heap_live_slots=>1745482, :heap_free_slots=>677, :heap_final_slots=>0, :heap_marked_slots=>1209781, :heap_swept_slots=>311048, :heap_eden_pages=>4284, :heap_tomb_pages=>0, :total_allocated_pages=>4284, :total_freed_pages=>0, :total_allocated_objects=>1927089623, :total_freed_objects=>1925344141, :malloc_increase_bytes=>779184, :malloc_increase_bytes_limit=>16777216, :minor_gc_count=>3310, :major_gc_count=>69, :remembered_wb_unprotected_objects=>62474, :remembered_wb_unprotected_objects_limit=>114206, :old_objects=>1105086, :old_objects_limit=>2048304, :oldmalloc_increase_bytes=>5276384, :oldmalloc_increase_bytes_limit=>16777216}, objspace_count: {:TOTAL=>1746159, :FREE=>624, :T_OBJECT=>205910, :T_CLASS=>18516, :T_MODULE=>2530, :T_FLOAT=>12, :T_STRING=>610680, :T_REGEXP=>2363, :T_ARRAY=>269067, :T_HASH=>78825, :T_STRUCT=>14477, :T_BIGNUM=>351, :T_FILE=>208, :T_DATA=>109289, :T_MATCH=>43797, :T_COMPLEX=>1, :T_RATIONAL=>3351, :T_SYMBOL=>4587, :T_IMEMO=>374971, :T_ICLASS=>6600}}

    ~8 hours
    RSS +45 MB
    PSS +45 MB
    thread count: no change
    heap_sorted_length: no change
    heap_live_slots: +46 slots (basically nothing)

    For NickL's appliance:

    [----] I, [2017-12-06T03:22:12.923382 #5923:129713c]  INFO -- : MIQ(MiqServer#monitor_poll) Memory Info XXXX => {RSS: 741952000, PSS: 585824000, thread_count: 30, gc_stats: {:count=>10480, :heap_allocated_pages=>4700, :heap_sorted_length=>5099, :heap_allocatable_pages=>0, :heap_available_slots=>1915728, :heap_live_slots=>1530577, :heap_free_slots=>385151, :heap_final_slots=>0, :heap_marked_slots=>1319595, :heap_swept_slots=>286346, :heap_eden_pages=>4474, :heap_tomb_pages=>226, :total_allocated_pages=>4700, :total_freed_pages=>0, :total_allocated_objects=>3063834210, :total_freed_objects=>3062303633, :malloc_increase_bytes=>10484176, :malloc_increase_bytes_limit=>33554432, :minor_gc_count=>9943, :major_gc_count=>537, :remembered_wb_unprotected_objects=>54739, :remembered_wb_unprotected_objects_limit=>109460, :old_objects=>1222317, :old_objects_limit=>2444412, :oldmalloc_increase_bytes=>35144208, :oldmalloc_increase_bytes_limit=>70727545}, objspace_count: {:TOTAL=>1915728, :FREE=>385098, :T_OBJECT=>159315, :T_CLASS=>19163, :T_MODULE=>2631, :T_FLOAT=>12, :T_STRING=>559164, :T_REGEXP=>2458, :T_ARRAY=>243886, :T_HASH=>73062, :T_STRUCT=>13526, :T_BIGNUM=>136, :T_FILE=>128, :T_DATA=>92059, :T_MATCH=>17742, :T_COMPLEX=>1, :T_RATIONAL=>2634, :T_SYMBOL=>3726, :T_IMEMO=>333875, :T_ICLASS=>7112}}
    [----] I, [2017-12-06T11:34:12.820998 #5923:129713c]  INFO -- : MIQ(MiqServer#monitor_poll) Memory Info XXXX => {RSS: 796092000, PSS: 614641000, thread_count: 30, gc_stats: {:count=>13111, :heap_allocated_pages=>4700, :heap_sorted_length=>5099, :heap_allocatable_pages=>0, :heap_available_slots=>1915728, :heap_live_slots=>1744613, :heap_free_slots=>171115, :heap_final_slots=>0, :heap_marked_slots=>1325536, :heap_swept_slots=>319202, :heap_eden_pages=>4490, :heap_tomb_pages=>210, :total_allocated_pages=>4700, :total_freed_pages=>0, :total_allocated_objects=>3829146031, :total_freed_objects=>3827401418, :malloc_increase_bytes=>12700016, :malloc_increase_bytes_limit=>33554432, :minor_gc_count=>12420, :major_gc_count=>691, :remembered_wb_unprotected_objects=>55046, :remembered_wb_unprotected_objects_limit=>109346, :old_objects=>1227807, :old_objects_limit=>2442046, :oldmalloc_increase_bytes=>72646944, :oldmalloc_increase_bytes_limit=>74343475}, objspace_count: {:TOTAL=>1915728, :FREE=>171062, :T_OBJECT=>165015, :T_CLASS=>19175, :T_MODULE=>2631, :T_FLOAT=>12, :T_STRING=>666079, :T_REGEXP=>2458, :T_ARRAY=>262742, :T_HASH=>81650, :T_STRUCT=>13914, :T_BIGNUM=>267, :T_FILE=>192, :T_DATA=>97860, :T_MATCH=>34969, :T_COMPLEX=>1, :T_RATIONAL=>2816, :T_SYMBOL=>3726, :T_IMEMO=>384047, :T_ICLASS=>7112}}

    ~8 hours
    RSS: +54 MB
    PSS: +28 MB
    thread count: no change
    heap_sorted_length: no change
    heap_live_slots: +214,036 slots (+14%)

    Nick LaMuro
    @NickLaMuro
    @dmetzger57 's appliance might be an idle workload, where mine has 6 C&U workers
    3 Processors, and 3 Collectors
    or maybe it is just 2 (one of each)
    let me check
    Joe Rafaniello
    @jrafanie
    But anyway... they both grew in spite of the "inside" ruby heap not growing
    Jason Frey
    @Fryguy
    @jrafanie Is this only reproducible on an appliance?
    Joe Rafaniello
    @jrafanie
    so far, yes
    Nick LaMuro
    @NickLaMuro
    alright, I am going to delete that initializer and restart the appliance if no one has any objections
    Dennis Metzger
    @dmetzger57
    @NickLaMuro if it’s the appliance I pointed @jrafanie to yesterday it is an idle workload, no providers under management
    Joe Rafaniello
    @jrafanie
    yeah, that's the one I was looking at @dmetzger57
    Keenan Brock
    @kbrock
    @NickLaMuro do you need the ObjectSpace portion?
    (as you know) I am always wary of that stuff
    Joe Rafaniello
    @jrafanie
    A little birdy suggested that we can clone your vm @NickLaMuro, and test various amounts of commenting out of the server process code
    Brandon Dunne
    @bdunne
    :wink:
    Nick LaMuro
    @NickLaMuro
    if that birdy rhymes with "Menace", then sure, I am confident that makes sense
    or maybe it rhymes with... okay, I got nothing if it was @bdunne (I just assumed it was @dmetzger57 )
    Joe Rafaniello
    @jrafanie
    Yeah, nothing funny rhymes with Brandon. Believe me, I tried.
    Dennis Metzger
    @dmetzger57
    “Dennis the Menace” haven’t heard that one in decades
    Jason Frey
    @Fryguy
    Dunne is ripe for puns though
    Keenan Brock
    @kbrock
    random kinda rhymes with brandon
    Jason Frey
    @Fryguy
    and on that note, I'm dunne here... :running:
    Keenan Brock
    @kbrock
    heh
    Dennis Metzger
    @dmetzger57
    I almost spit up my coffee, thanks @Fryguy
    Joe Rafaniello
    @jrafanie
    yeah, leave @Fryguy