    Daniel Berger
    @djberg96
    anyway, i was just saying i would have to fiddle with postgresql, parent and child tables, and cascading deletes
    Keenan Brock
    @kbrock
    well, we insert into metrics, but a trigger deletes from metrics and inserts into the appropriate metrics_*
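    (For context, a redirect trigger of the kind described here is often written roughly as sketched below; the function name, the hour-of-day table selection, and the migration wrapper are all illustrative assumptions, not the actual ManageIQ trigger.)

    class AddMetricsInsertTrigger < ActiveRecord::Migration[5.0]
      def up
        # Hypothetical sketch: route rows inserted into the parent "metrics"
        # table into an hourly child table; RETURN NULL suppresses the row on
        # the parent, which is the "deletes from metrics" effect.
        execute <<-SQL
          CREATE OR REPLACE FUNCTION metrics_insert_redirect() RETURNS trigger AS $$
          BEGIN
            EXECUTE format('INSERT INTO metrics_%s SELECT ($1).*',
                           date_part('hour', NEW.timestamp)::int)
              USING NEW;
            RETURN NULL;
          END;
          $$ LANGUAGE plpgsql;

          CREATE TRIGGER metrics_partition_insert
            BEFORE INSERT ON metrics
            FOR EACH ROW EXECUTE PROCEDURE metrics_insert_redirect();
        SQL
      end
    end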
    Daniel Berger
    @djberg96
    and we do that because....? i can't remember any more
    Keenan Brock
    @kbrock
    I've given up on that question
    Daniel Berger
    @djberg96
    sorry :(
    Keenan Brock
    @kbrock
    sharding is webscale
    it was either that or switching over to mongo (or /dev/null)
    Daniel Berger
    @djberg96
    or oracle!
    :-P
    alright, time for dinner, gitter is dropping messages now, cya later
    Jason Frey
    @Fryguy
    @kbrock Reading back I don't understand where you are getting that expanded query from since the limit is not in the AR query
    If I recall, PG is smart enough to not even consider the subtables that don't fit into the time range, so it's not 24 queries... Unless something has changed
    Keenan Brock
    @kbrock

    @Fryguy thnx - it should be:

    Metric.where(:id => Metric.where('timestamp < ?', 4.hours.ago).limit(100)).delete_all

    otherwise we'd just use:

    Metric.where('timestamp < ?', 4.hours.ago).delete_all

    also, we use arel/config values in the actual generation there. I simplified for the room
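
    (Side note on why the nested form is needed: PostgreSQL's DELETE statement has no LIMIT clause, so batching the delete has to go through an id subquery like the one above. A quick way to sanity-check what Rails builds for the inner relation, hedged as "roughly" since the exact SQL depends on the Rails version:)

    # Preview the batching subquery; delete_all itself has no to_sql,
    # but the inner relation does.
    Metric.where('timestamp < ?', 4.hours.ago).limit(100).select(:id).to_sql
    # => roughly: SELECT "metrics"."id" FROM "metrics"
    #             WHERE (timestamp < '...') LIMIT 100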

    Chris Arcand
    @chrisarcand
    @kbrock Answering your original question, yes...? There's nothing stopping you from removing the trigger and writing things in Ruby/SQL to do the sharding and fine tune deleting things per table shard instead of leaving it to PG, if that really is a pain point. I thought that was already one of your known goals.
    I mean I don't know if that's a good idea or if you'll benefit from it, but I don't see why it's not possible.
    Keenan Brock
    @kbrock
    @chrisarcand ooh - I was asking if it were possible to Metric.with_table(:metrics_2).create(:timestamp => xxx)
    wasn't looking at modifying our sharding technique (yes I don't like it but... seems to work well enough)
    Jason Frey
    @Fryguy
    Ah, I see what you're trying to do now... I like that with_table idea
    Keenan Brock
    @kbrock
    well, that is the insert version. wanted to use it for the delete version
    (I assumed the research was more of an insert mode rather than delete)
    @Fryguy should we just implement a truncate strategy instead of our current delete with batches mode?
    not ready for it with MetricRollup since that has a dependent table, but Metric / metrics seems ripe for the picking (rough sketch below)
    I just don't know how to tell ActiveRecord, for this one query, to use a different table. I think table_name can be a lambda. maybe that will be it
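
    (On the truncate strategy floated above, a rough sketch of what a per-child-table pass could look like. The 0..23 range and the metrics_N naming are assumptions based on the hourly tables discussed in this room; the "anything newer than the cutoff?" check is deliberately naive.)

    # Hypothetical sketch: truncate whole hourly child tables instead of
    # deleting rows in batches. TRUNCATE skips per-row work entirely, which
    # is the appeal, but it is only safe when every row in the child is
    # outside the retention window.
    cutoff = 4.hours.ago
    (0..23).each do |hour|
      child = "metrics_#{hour}"
      recent = Metric.connection.select_value(
        "SELECT COUNT(*) FROM #{child} WHERE timestamp >= #{Metric.connection.quote(cutoff)}"
      ).to_i
      next if recent > 0
      Metric.connection.execute("TRUNCATE TABLE #{child}")
    end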
    Jason Frey
    @Fryguy
    I did it a while ago, but can't remember how
    But yeah, truncate strategy is good... Just have to be careful if someone sets the config higher than the number of tables... Maybe we prevent that for realtime
    Keenan Brock
    @kbrock
    @Fryguy I did post the current metrics deletes [ref again] -- it is very impressive
    Jason Frey
    @Fryguy
    Is it possible to run again with Explain Analyze and some data?
    Chris Arcand
    @chrisarcand

    You'd have to make your own with_table and just set the table metadata on the model if you want that sort of thing over just using arel. Rails doesn't provide any sort of dynamic 'use this table name for this query' that I'm aware of.

    def self.with_table(table_name)
      # Save the old table name somehow, whatever...
      @old_table_name = self.table_name
    
      begin
        self.table_name = table_name
        yield
      ensure
        self.table_name = @old_table_name
      end
    end

    ^ totally making this up, haven't tried it.

      Metric.with_table(:metrics_2) do
        Metric.create(:timestamp => blah)
      end
    maybe?
    Jason Frey
    @Fryguy
    If I recall, that's similar to what I had done in the past
    Chris Arcand
    @chrisarcand
    Who knows how that screws up the model's state, too; you might have to reset_column_stuff whatever that method is called at the end.
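    (For the record, the method being reached for is reset_column_information; if it did turn out to matter, a hedged tweak to the with_table sketch above might look like this:)

    def self.with_table(table_name)
      old_table_name = self.table_name
      self.table_name = table_name
      reset_column_information # flush cached column data for the new table
      yield
    ensure
      self.table_name = old_table_name
      reset_column_information # and again for the restored table
    end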
    Jason Frey
    @Fryguy
    That should be ok because the child tables have the exact same structure
    Keenan Brock
    @kbrock
    @Fryguy I have a table with data - but it takes > 4 minutes to run
    Jason Frey
    @Fryguy
    the explain analyze does?
    Keenan Brock
    @kbrock
    yup
    Jason Frey
    @Fryguy
    cool
    throw it in here when it's done: https://explain.depesz.com/
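    (A hedged sketch of one way to capture the plan text for depesz from a Rails console, reusing the simplified query from earlier; the real query being profiled will differ:)

    # EXPLAIN ANALYZE actually executes the statement; that is harmless for
    # this SELECT, but be careful before pointing it at the DELETE itself.
    sql  = Metric.where('timestamp < ?', 4.hours.ago).limit(100).select(:id).to_sql
    plan = Metric.connection.execute("EXPLAIN ANALYZE #{sql}")
    puts plan.map { |row| row['QUERY PLAN'] }.join("\n")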
    Keenan Brock
    @kbrock
    @chrisarcand yea. well, you used to be able to set table_name = Proc.new {}
    Jason Frey
    @Fryguy
    blog post with more info on the second link: http://tatiyants.com/postgres-query-plan-visualization/
    Chris Arcand
    @chrisarcand
    can you not anymore?
    Keenan Brock
    @kbrock
    @chrisarcand that does work (I assume)
    would be cool if the query came into the lambda :)
    Keenan Brock
    @kbrock
    @Fryguy I ran the query with data. tried sending it to the service. the JSON is 23,332 lines - it can't handle it
    I'll try again with metrics_21 in the outer query.
    Jason Frey
    @Fryguy
    gist?
    Keenan Brock
    @kbrock
    the tried and true one still rocks: https://explain.depesz.com/s/UuPX (the newer one is pretty, but doesn't get me what I want) http://tatiyants.com/pev/#/plans/plan_1518804077072
    smaller metrics_21 gist
    the larger metrics gist is practically unresponsive