    Bram van Dijk
    @bramvandijk88_twitter
    Is this a spades-specific thing?
    Björn Grüning
    @bgruening
    @bramvandijk88_twitter yes Spades are requesting a lot of memory.
    @arcari_gabriele_twitter @eschen42 I try to answer during the day. But I'm currently swamped ... sorry
    Bram van Dijk
    @bramvandijk88_twitter

    @bramvandijk88_twitter yes Spades are requesting a lot of memory.

    Alright. Are there subscription services available to prioritise things?

    Helena Rasche
    @hexylena
    The subscription/paid option is currently: set up your own Galaxy, e.g. using CloudMan. (Judging by your name, SURFSara.nl might be an option for you if you're in NL. They have a Galaxy environment.)
    Bram van Dijk
    @bramvandijk88_twitter
    Well, I'm in Germany right now, but no worries. So the take-home message is: there are services that provide this, but not here?
    Helena Rasche
    @hexylena
    There are no services that provide a "fast lane" in any public Galaxy that I'm aware of. If you want such a fast lane, you generally need to set up, host, and manage it yourself on some infrastructure you have access to.
    Bram van Dijk
    @bramvandijk88_twitter
    Alright, that's always an option too. Thanks
    Helena Rasche
    @hexylena
    I believe there is work being done to allow attaching your own compute, but it will be some years? before this is generally available.
    Bram van Dijk
    @bramvandijk88_twitter
    So, does it estimate beforehand how much memory I will use, or is spades just very lowly prioritised?
    Because I'm assembling relatively simple things...
    Helena Rasche
    @hexylena

    we need to schedule the job, so spades requests a set amount of memory before it can be scheduled.

    We do not have a good, accurate model yet for how much memory spades uses based on input settings, so, we set it to a high number that can accommodate the full range of assemblies being conducted on EU.

    Unfortunately that means for even small assemblies, spades is a "big" job that is hard to find space for in our queue.

    https://github.com/usegalaxy-eu/infrastructure-playbook/blob/264a20ab8e3136ffc0b6c259f64396cc94b917f9/files/galaxy/dynamic_rules/usegalaxy/tool_destinations.yaml#L961 here is our file which defines how much memory / how many CPU cores each tool gets, and spades requests 20 cores and 400 GB of ram. These machines are in high demand for similar jobs and so it takes some time unfortunately :(
    EU is working towards modelling tools based on input sizes to guess at how much memory will be needed, but that work is some months away I think.
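
The input-size-based approach mentioned here could look roughly like the sketch below. It is only an illustration: the function name `estimate_mem_gb` and the coefficients (4 GB base, 10 GB of RAM per GB of input) are invented placeholders, not values from the UseGalaxy.eu configuration. Only the 400 GB ceiling comes from the discussion.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a memory request (in GB) from the input size.
# The coefficients are placeholders, NOT values used by UseGalaxy.eu.

estimate_mem_gb() {
    local input_bytes=$1
    local base_gb=4       # assumed fixed overhead
    local gb_per_gb=10    # assumed GB of RAM per GB of input
    local max_gb=400      # the fixed request mentioned in the chat, used as a cap

    # Round the input size up to whole GB using integer arithmetic.
    local input_gb=$(( (input_bytes + 1073741823) / 1073741824 ))
    local mem_gb=$(( base_gb + gb_per_gb * input_gb ))

    # Never request more than the cap.
    if (( mem_gb > max_gb )); then
        mem_gb=$max_gb
    fi
    echo "$mem_gb"
}
```

With a rule like this, a small assembly would ask for a few GB instead of the full 400 GB and could be scheduled on far more machines. A real model would have to be fitted to observed memory usage per tool.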
    Bram van Dijk
    @bramvandijk88_twitter
    Hmmm, well that's unfortunate. Megahit works too, but it assembles these things a bit less well.
    Different question: what do the checkboxes in the workflow-nodes actually do?
    Helena Rasche
    @hexylena
    It's a free service, we offer what we can to a wide audience :)
    the checkboxes control if the outputs will be shown following a workflow run, or hidden
    they let you check e.g. your final report only. So when the user runs the workflow, they see the jobs run + disappear, and only the final datasets that are interesting to them are left.
    Bram van Dijk
    @bramvandijk88_twitter

    Ah that's actually great. I couldn't figure this out sorry.

    Yeah, this is great for a free service. I think our institute would be willing to pay for dedicated servers though. You should consider this.

    Björn Grüning
    @bgruening
    @bramvandijk88_twitter we do. Please contact me on this issue.
    For the time being, the only spades jobs that I see queued have been actually resubmitted because they crashed.
    Helena Rasche
    @hexylena
    I guess in general it has been "institutes adding capacity for everyone" and not just specific groups/institutes, but maybe EU's policies can be changed.
    Björn Grüning
    @bgruening
    Crash, most likely due to out of memory.
    Bram van Dijk
    @bramvandijk88_twitter

    For the time being, the only spades jobs that I see queued have been actually resubmitted because they crashed.

    Yeah, I cancelled the ones that weren't crashing about an hour ago

    Björn Grüning
    @bgruening

    I guess in general it has been "institutes adding capacity for everyone"

    That is still true.

    10 Spades jobs are currently running, each requesting 400 GB of memory.
    Bram van Dijk
    @bramvandijk88_twitter
    Both could be the case. I'm sure we can make all ships rise with the same tide.
    Oh really?
    I thought I cancelled them
    Also: it should take no more than 2 GB :')
    It's really a small number of reads
    Where can I find these jobs though? Or do you mean there not my jobs?
    they're (sorry, not native :')))
    Björn Grüning
    @bgruening
    Sorry, I meant overall spades jobs
    @bramvandijk88_twitter sorry that we are slow in responding, not a good week here.
    Bram van Dijk
    @bramvandijk88_twitter
    No worries, I totally understand.
    I enjoy the tool, so please know it's deeply appreciated
    (IT being your work XD)
    Arthur Eschenlauer
    @eschen42

    Regarding my Query Tabular issue, I found a workaround: when I rerun it, if I expand the Table Options for each input table when re-invoking the tool then the options are correctly selected. If I don't they seem to be lost more likely than not. See e.g.
    https://usegalaxy.eu/u/eschen42/h/forgetfulquerytabular
    where dataset 7 was created from datasets 1-6. Dataset 8 failed when the "rerun" button was pressed followed immediately by the Execute button. Using the "circle-i" button, inspection shows that Use first line as column names is set to false for all tables for Dataset 8 whereas it was set to true for all tables for Dataset 7.

    Right now I am thwarted in my effort to set up another galaxy instance to reproduce it there.

    Arthur Eschenlauer
    @eschen42
    I just pulled 20.05 using https://galaxyproject.org/admin/get-galaxy/#cloning-new and don't seem to see the issue yet.
    Nicola Soranzo
    @nsoranzo
    UseGalaxy.eu is using the (still unreleased) 20.09
    Marius van den Beek
    @mvdbeek
    @eschen42 galaxyproject/galaxy#10584 should fix the problem
    Arthur Eschenlauer
    @eschen42
    I upgraded to release_20.09 as described on https://galaxyproject.org/admin/get-galaxy/#updating-existing and reproduced the problem. Thanks @nsoranzo and @mvdbeek . I will try the PR and let you know how it goes.
    Arthur Eschenlauer
    @eschen42
    @mvdbeek I am delighted to report that after update-existing with mvdbeek:do_render_sections the tool does not fail when I rerun it!
    Marius van den Beek
    @mvdbeek
    cool, thanks for confirming!
    and sorry for breaking it in the first place :/
    Arthur Eschenlauer
    @eschen42
    You are welcome, and I know the feeling...
    Bram van Dijk
    @bramvandijk88_twitter

    I've got a fasta file of 2 lines. A header, and a 3 billion base pair long genome. BWA mem / index somehow trips over this long genome, so I'm thinking it may be resolved if I split it into chunks. I wrote a bash oneliner to do that:

    i=0; while read -n 300000 chars; do i=$((i+1)); printf ">Chunk_${i}\n%s\n" "$chars"; done < myfile.fa

    As far as I understand, the faSplit tool does something similar, but when I feed it my file it crashes and states "this list is empty". Help appreciated.
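
One thing to watch with the one-liner: `read -n` stops at a newline, so the header line is read as the first "chunk" and ends up inside `Chunk_1`. A possible alternative, offered only as a sketch, drops the header first and lets `fold` do the chunking. The function name `split_fasta` is made up for illustration, and it assumes a single-record FASTA (several records would be merged into one sequence before splitting).

```shell
#!/usr/bin/env bash
# Sketch: split the sequence of a single-record FASTA into fixed-width chunks.
# Assumes one record per file; header lines are discarded.

split_fasta() {
    local fasta=$1
    local width=${2:-300000}   # chunk length in bases, default as in the one-liner

    # Drop header lines, join the sequence into one line, re-wrap at $width,
    # then give every wrapped line its own numbered FASTA header.
    grep -v '^>' "$fasta" | tr -d '\n' | fold -w "$width" |
        awk '{ printf ">Chunk_%d\n%s\n", NR, $0 }'
}
```

Usage would be along the lines of `split_fasta myfile.fa 300000 > chunks.fa`. Reads that span a chunk boundary cannot align across it, so overlapping chunks may be worth considering depending on the goal.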

    Björn Grüning
    @bgruening
    @bramvandijk88_twitter try this tool