Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Activity
    Helena Rasche
    @hexylena:matrix.org
    [m]
    but I'm wondering if we need to index by annotation, indexing by user_id/history_id both make sense.
    I don't know if I've ever seen a way to query history annotations in the UI by text content? Does the history search do that currently?
    ok, no, can confirm that history search does search the annotations.
    should probably be a GIN index
    not a btree
    Helena Rasche
    @hexylena:matrix.org
    [m]
    all the PG text search code I'm looking at uses GINs
    Helena Rasche
    @hexylena:matrix.org
    [m]
    bgruening: dropping and re-creating that index is going to be very expensive if it locks the table, maybe something to try during your downtime? Or just drop it + re-create on the new machine?
    1 reply
    but it's probably over to the sqlalchemy folks to figure out how to do a different index type in postgres only. and how not to make that a migration since locking the history table will kill big servers.
    Helena Rasche
    @hexylena:matrix.org
    [m]
    yeah something like: drop index, (server gets slow for annotation searchers), export, reimport (new machine), re-create the index with type=gin now
    bgruening
    @bgruening:matrix.org
    [m]
    yes, this we can do.
    I assume others will see similar problems when they update to >PG12 and have big instances so, we need to discuss this also on the code level.
    Helena Rasche
    @hexylena:matrix.org
    [m]
    yeah definitely. I'm thinkin btree is the wrong index type here (and for any other columns with type=text) so, definitely going to be something we need to think about how to change
    sj213
    @sj213:matrix.org
    [m]
    CREATE INDEX and REINDEX both support the keyword CONCURRENTLY, which allows for (re-)creating an index w/o locking the pertinent table. This operation will of course run even longer than a idxbuild with a writelock and definitely have an impact on server performance but blocking writes for hours can be avoided this way. (Disclaimer: this is according to the manual, I never tried it myself)
    Helena Rasche
    @hexylena:matrix.org
    [m]
    still, if that's an option that's fantastic
    sj213
    @sj213:matrix.org
    [m]
    Just for fixing ix_history_annotation_association_annotation in our new server instance concurrent idxbuild may not be needed at all as we could do it before putting the box into production. OTOH, I expect the whole process of dump & reload to take around 9h already w/o manual fiddling with any indexes, so depending on how the operation goes schedule-wise I'll still consider it an option :)
    sj213
    @sj213:matrix.org
    [m]
    What's more problematic is that I apparently can't just substitute GIN for BTREE in CREATE INDEX
    When I try "CREATE INDEX ix_history_annotation_association_annotation ON history_annotation_assiciation USING GIN (annotation);" what it get is 'ERROR: data type text has no default operator class for access method "gin"'
    Followed by "HINT: You must specify an operator class for the index or define a default operator class for the data type"
    sj213
    @sj213:matrix.org
    [m]
    The GIN-idx examples in the manual all use functions like to_tsquery() or to_tsvector(). This is frankly all greek to me XD All the text matching I ever did so far in SQL was using LIKE...
    Helena Rasche
    @hexylena:matrix.org
    [m]
    yeah
    https://www.postgresql.org/docs/current/indexes-types.html nothing else there sticks out to me as a better option for text, but, ok, maybe there's some bigger changes involved in switching to that.
    but I guess the short term solution is drop the index and hope no one is searching by history
    sj213
    @sj213:matrix.org
    [m]
    Advanced text matching is, apparently, an advanced topic... ;)
    Helena Rasche
    @hexylena:matrix.org
    [m]
    BRINs aren't right, SP-GiST are spatial, GiST lack equality, Hash only have equals.
    sj213
    @sj213:matrix.org
    [m]
    The index hasn't been created in the first place, so no need to drop it in the new instance. But yes, if we don't find a solution certain querys will probably be slooow... :\
    Or else, unless anything in the current version of Galaxy requires v12-features, we could maybe just try our luck with PG v11 and stick with that until a general solution has emerged
    The new server instance isn't in production yet, so all options are on the table
    Björn Grüning
    @bgruening
    There is a way to check how often this index was used in the past, isn't it?
    can we get this number
    sj213
    @sj213:matrix.org
    [m]
    For "SELECT idx_scan,idx_tup_read,idx_tup_fetch FROM pg_stat_user_indexes WHERE indexrelname='ix_history_annotation_association_annotation';" I get 0,0,0 :)
    for 'ix_history_annotation_association_user_id' the numbers are 8,8,8
    Björn Grüning
    @bgruening
    can you double check this with other indices? @hexylena:matrix.org @natefoo could you see if you get the same results on your instances maybe?
    if its 0, we should just drop it ;)
    also upstream
    sj213
    @sj213:matrix.org
    [m]
    For ..._history_id it is 51227775, 203278, 203001
    So the zeroes are not just zero because no data was collected it would seem
    Nate Coraor
    @natefoo:matrix.org
    [m]
    galaxy_main=> SELECT idx_scan,idx_tup_read,idx_tup_fetch FROM pg_stat_user_indexes WHERE indexrelname='ix_history_annotation_association_annotation';
     idx_scan | idx_tup_read | idx_tup_fetch
    ----------+--------------+---------------
            0 |            0 |             0
    (1 row)
    sj213
    @sj213:matrix.org
    [m]
    Ok, this particular index looks indeed dispensable
    Helena Rasche
    @hexylena:matrix.org
    [m]
    Whoo!
    Björn Grüning
    @bgruening
    here we go, lets get rid of it :)
    is there a way to do this for all indices in a loop @sj213:matrix.org ?
    1 reply
    maybe we can remove more?
    Curtis Ross
    @cross12tamu

    Alright...I can't find where environment variables are declared for Galaxy.

    We're trying to migrate our Apollo services and I cannot make the environment variable change for the life of me. I thought it was in /etc/supervisor/conf.d/<files> but...evidently it isn't picking up. Even after restart.

    Is there another place that environment variables can get declared? Something perhaps that I am missing? At least obvious ones.

    <we don't use ansible atm, we are currently old school/bare metal>

    Nolan Woods
    @innovate-invent
    how is galaxy executed on your server?
    Curtis Ross
    @cross12tamu
    supervisor, but, I think I answered my own question and the env vars in that file are getting picked up, they're just not getting picked up on restart. So, looking into that issue now.
    I should clarify, I didn't answer my question, but, it would appear that the env vars from that file are making their way into galaxy but they're not making it in when modified after restart for some reason. But, I think what I'm trying to figure out now is more of why they don't get picked up on restart compared to is there another place for env vars if that makes sense. Thanks @innovate-invent
    9 replies
    Nolan Woods
    @innovate-invent
    modified after restart?
    how do you mean
    Curtis Ross
    @cross12tamu

    I just meant the modifications to catch after it is restarted.

    supervisorctl restart galaxy: results in the same config as before/after the modifications to the galaxy.conf file. Hopefully that makes sense. Apologies for the confusion.