Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Repo info
Activity
    Keenan Brock
    @kbrock
    what? me? surely you wouldn't have seen that!
    Nick LaMuro
    @NickLaMuro
    ಠ_ಠ
    Keenan Brock
    @kbrock
    @NickLaMuro saw this one coming a mile away: kbrock/activerecord-hash_options#15
    Keenan Brock
    @kbrock
    thanks
    Nick LaMuro
    @NickLaMuro
    that was interesting, though surprised to find out it takes python like 3-4x longer to run the exact same kind of code... that was surprising. Wonder if it was how it handles large integers or something that causes the slowness. Or maybe the author was using python 2.7, which is very old at this point.
    Keenan Brock
    @kbrock
    yea, surprised me too. I assume that all python code is faster than ruby.
    So if ruby can't scale, python? forgetttaboutit
    the vm talk was very cool. I like the idea of an embedded ruby operand, like c and asm {}
    Nick LaMuro
    @NickLaMuro
    So didn't have a good place to share this, and while it ultimately was a not useful for my purposes, I think there are some places where this can be helpful to some. That said... it will be a :book:... you have been warned...

    Background

    Effectively, I was working on trying to fetch the User record being authorized/authenticated and any associated records in one fell swoop for any given request, and this was one of the last pieces I was trying to get working.

    The tl; dr is I was adding this scope on the User model:

    class User < ApplicationRecord
      scope :api_includes, -> {
        eager_load(:current_group => [
          :miq_user_role,
          :tenant,
          :entitlement
        ]).select('miq_user_roles.feature_identifiers')
      }
    
      # ...
    end

    And swapping out:

    :miq_user_role,

    from the above scope for:

    {:miq_user_role => :miq_product_features},

    caused a different query to be executed, but more on that below.

    Using array_agg in ActiveRecord

    To start, it turns out you can do things like this with ActiveRecord:

    $ bin/rails c
    ** ManageIQ master, codename: Kasparov
    Loading development environment (Rails 5.2.4.3)
    irb(main):001:0> MiqUserRole.left_joins(:miq_product_features)
                                .select(MiqUserRole.arel_table[Arel.star])
                                .select('array_agg(miq_product_features.identifier) AS feature_identifiers')
                                .group(MiqUserRole.arel_table[:id])
                                .find_by_id(2)
    #   MiqUserRole Load (1.0ms)  SELECT  "miq_user_roles".*, array_agg(miq_product_features.identifier) AS feature_identifiers
    #                             FROM "miq_user_roles"
    #                             LEFT OUTER JOIN "miq_roles_features" ON "miq_roles_features"."miq_user_role_id" = "miq_user_roles"."id"
    #                             LEFT OUTER JOIN "miq_product_features" ON "miq_product_features"."id" = "miq_roles_features"."miq_product_feature_id"
    #                             WHERE "miq_user_roles"."id" = $1
    #                             GROUP BY "miq_user_roles"."id"
    #                             LIMIT $2  [["id", 2], ["LIMIT", 1]]
    #   MiqUserRole Inst Including Associations (0.1ms - 1rows)
    #=> #<MiqUserRole id: 2, name: "EvmRole-administrator", read_only: true, created_at: "2020-07-20 15:32:43", updated_at: "2020-07-20 15:32:43", settings: nil, feature_identifiers: ["add_global_filter", "vm_explorer", "vm_infra_explorer", ...]>
    irb(main):002:0> _.feature_identifiers
    #=> ["add_global_filter", "vm_explorer", "vm_infra_explorer", ...]
    irb(main):003:0>

    And it just works out of the box, converting the MiqProductFeatures.arel_table[:identifiers] directly into an array of strings, without having to make a separate query.

    There is already a method on MiqUserRoles that does this, but trying to make it one query using eager_load isn't actually doable because of the has_many :miq_product_features :join_table => ... relationship, so you get something like this:

    irb(main):005:0> MiqUserRole.eager_load(:miq_product_features).find_by_name("EvmRole-administrator")
    #   SQL (0.7ms)  SELECT  DISTINCT "miq_user_roles"."id"
    #                FROM "miq_user_roles"
    #                LEFT OUTER JOIN "miq_roles_features" ON "miq_roles_features"."miq_user_role_id" = "miq_user_roles"."id"
    #                LEFT OUTER JOIN "miq_product_features" ON "miq_product_features"."id" = "miq_roles_features"."miq_product_feature_id"
    #                WHERE "miq_user_roles"."name" = $1
    #                LIMIT $2
    #                [["name", "EvmRole-administrator"], ["LIMIT", 1]]
    #   SQL (0.9ms)  SELECT "miq_user_roles"."id" AS t0_r0, "miq_user_roles"."name" AS t0_r1,
    #                       "miq_user_roles"."read_only" AS t0_r2, "miq_user_roles"."created_at" AS t0_r3,
    #                       "miq_user_roles"."updated_at" AS t0_r4, "miq_user_roles"."settings" AS t0_r5,
    #                       "miq_product_features"."id" AS t1_r0, "miq_product_features"."identifier" AS t1_r1,
    #                       "miq_product_features"."name" AS t1_r2, "miq_product_features"."description" AS t1_r3,
    #                       "miq_product_features"."feature_type" AS t1_r4, "miq_product_features"."protected" AS t1_r5,
    #                       "miq_product_features"."parent_id" AS t1_r6, "miq_product_features"."created_at" AS t1_r7,
    #                       "miq_product_features"."updated_at" AS t1_r8, "miq_product_features"."hidden" AS t1_r9,
    #                       "miq_product_features"."tenant_id" AS t1_r10
    #                FROM "miq_user_roles"
    #                LEFT OUTER JOIN "miq_roles_features" ON "miq_roles_features"."miq_user_role_id" = "miq_user_roles"."id"
    #                LEFT OUTER JOIN "miq_product_features" ON "miq_product_features"."id" = "miq_roles_features"."miq_product_feature_id"
    #                WHERE "miq_user_roles"."name" = $1 AND "miq_user_roles"."id" = $2 
    #                [["name", "EvmRole-administrator"], ["id", 2]]
    #   MiqUserRole Inst Including Associations (11.0ms - 83rows)
    #=> #<MiqUserRole id: 2, name: "EvmRole-administrator", read_only: true, created_at: "2020-07-20 15:32:43", updated_at: "2020-07-20 15:32:43", settings: nil>
    irb(main):006:0> _.feature_identifiers
    #=> ["add_global_filter", "vm_explorer", "vm_infra_explorer", ... ]

    Where the SELECT DISTINCT is unavoidable since it is part of the ActiveRecrod when doing an eager load that requires a deeply nested joins.

    https://github.com/rails/rails/blob/92fd1c202bc6cdb945c110b23f9879de0fbd9b0e/activerecord/lib/active_record/relation/finder_methods.rb#L389-L395

    So I set out to try a few things.

    using a scope

    So slightly changing the above code to move it into a scope actually works just fine:

    class MiqUserRole < ActiveRecord::Base
      scope :with_feature_identifiers, -> {
        left_joins(:miq_product_features)
          .select(MiqUserRole.arel_table[Arel.star])
          .select('array_agg(miq_product_features.identifier) AS feature_identifiers')
          .group(MiqUserRole.arel_table[:id])
      }
    
      # ...
    end
    
    MiqUserRole.with_feature_identifiers.find_by_id(2)

    However, my goal is to make this accessible by eager loading from User, so this really isn't solving my problem. Though still neat I guess.

    using a virtual_attribute

    This has the same problem as the above, and requires a crap ton more work to get the :type working as well:

    class MiqUserRole < ActiveRecord::Base
      virtual_attribute :feature_identifiers,
                        ActiveRecord::ConnectionAdapters::PostgreSQL::OID::Array.new(ActiveRecord::Type.lookup(:string)),
                        :arel => (lambda do |t|
        product_features  = MiqProductFeature.arel_table
        miq_role_features = Arel::Table.new(:miq_roles_features)
        array_agg         = Arel::Nodes::NamedFunction.new("array_agg", [product_features[:identifier]])
    
        t.grouping(
          product_features.project(array_agg)
                          .join(miq_role_features, Arel::Nodes::OuterJoin)
                            .on(miq_role_features[:miq_product_feature_id].eq(product_features[:id]))
                          .where(miq_role_features[:miq_user_role_id].eq(t[:id]))
                          .group(miq_role_features[:miq_user_role_id])
        )
      end)
    
      # ...
    end
    
    MiqUserRole.eager_load(:miq_product_features).find_by_name("EvmRole-administrator")

    But technically could be modified to work with the User model directly, though that would probably require more changes overall to the code base to make that work in the spots I need it to (for my purposes).

    Nick LaMuro
    @NickLaMuro
    (told you it would be long :bow: )
    Keenan Brock
    @kbrock

    Very nice. and not too "clever"

    (Obviously it is nice. That is why my brain is running wild with all the possibilities. so sorry to chime in with "you should's" - it is my way of saying this is great and we should use it in more than this one place)

    1. wonder if you can use the :with_feature_identifiers AR instead of arel in :feature_identifiers
    2. Can you use :array_set or something similar? Maybe we can define a string array and integer array to make using this easier.
    3. Using array_agg is a great idea, along with unnest which is the reverse. They are tricky for me and requires a lot of concentration.
    4. I do wish we used eager_load and preload vs the whole includes().references() stuff - but you remember we had that pr and the gotchas in the end forced an abort.
    5. I do think we could create a virtual_array_attribute that would do most of this work. (yes, going all meta on you) - this code looks like a pattern that is very reproducible
    Keenan Brock
    @kbrock

    you got to love the performance conundrum

    ms query qry ms rows comments
    6,630.5 14 4,731.1 50 /vm_infra/report_data#before
    970.5 173 196.3 169 /vm_infra/report_data#after

    ms is much bigger bofore, but number of queries is much larger after...

    obviously I'm going to track this down and hope for the best of both worlds, but it is fun to see

    Joe Rafaniello
    @jrafanie
    "fun to see" <---- :laughing:
    good luck!
    Keenan Brock
    @kbrock
    of course I want the best of both worlds... but don't we all
    thnx
    Joe Rafaniello
    @jrafanie
    make sure you're getting the same results... I've done silly things where I assumed I was measuring the same thing but I changed the behavior of the thing so the result was different :laughing:
    Keenan Brock
    @kbrock
    haven't we all
    I've narrowed it down to 1 commit. with 5 files changed. should be straight forward
    famous last words...
    Jason Frey
    @Fryguy
    Add some latency and then compare also 😉
    Keenan Brock
    @kbrock
    heh

    h-kataria/manageiq-ui-classic#1

    ms query qry ms rows comments
    6,630.5 14 4,731.1 50 before 7189
    970.5 173 196.3 169 after 7189
    569.8 13 54.5 49 after this pr
    41.3% 92% 72.2% 71% improvement

    don't need latency to see all the numbers getting better

    I know the pr spiked the queries, but all in all, that pr is a great improvement
    this just nudges it in the right direction
    Jason Frey
    @Fryguy
    :tada:
    Peter McGowan
    @pemcg
    Hi all, what tool do we use for simulating large VMware environments? I’ve been Googling things like vcsim and govcsim which seem quite cool. I need to fire up a multi 1000 VM vCenter env
    Adam Grare
    @agrare
    Hey @pemcg we use the 5.5 vcsim for our vmware scale testing but I've been meaning to try out govcsim for a long time
    I have a few different sized vcsims, PM me if you want to use one
    it looks pretty cool

    I have a few different sized vcsims, PM me if you want to use one

    Thanks

    Dennis Metzger
    @dmetzger57
    Thanks for that pointer @pemcg
    Peter McGowan
    @pemcg
    np @dmetzger57
    Keenan Brock
    @kbrock
    I had tried the go simulator a while back and it was missing a lot of the data
    But the striking thing about the go simulator is that it seems to be part of a rewrite of the engine and isn’t just some bogus simulator.
    The cvs I’m is a real environment but is just so big
    Ugh. The vcsim is bloated but uses the real code, which is good. Think it is deprecated
    I need to try out the go one again. It had so much promise and good VMware backing
    Keenan Brock
    @kbrock
    ^ ignore. Very excited to hear your experiences with the new go sim.
    Peter McGowan
    @pemcg

    ^ ignore. Very excited to hear your experiences with the new go sim.

    Well I had no experience with the old one, so seeing 2 emulated VMs is exciting for me :smile:

    Adam Grare
    @agrare
    yeah depends on what you want to test, the nice thing about the old vcsim is that it is a real vcenter so it will exhibit the same behavior as you'll see in a prod env
    so if you just want to see "what does 1000 vms look like" I'm sure govcsim is fine, if you want to see how your code will work on a real vcenter with 1000 vms I don't thing govcsim will do that
    Peter McGowan
    @pemcg
    ok that’s useful to know, thanks. For now I just want quantity of VMs, hosts, clusters etc to get an idea of timings and memory usage
    Keenan Brock
    @kbrock
    the basics in the go were working even a while back. I was concentrating on metrics at the time and the go api just wasn't there yet
    it sounded like the go simulator was possibly a front end for some future vcenter work - wonder if they are thinking about doing a major revamp - at least of the soap front end. Which makes sense. Quite a few vendors rewrite the web service layer in go. well in the ruby world at least.