Marius van den Beek
@mvdbeek
do you have the tool somewhere ?
M Bernt
@bernt-matthias

Sniffing might be an option .. and appealingly simple .. but there are some very underspecified data types used in OpenMS for which sniffing does not work.

I could post a CTD file that I am trying to auto-translate.

In the end I just want to set the format of the output and I know that it is the same as the input, but there are multiple possible input formats.
Marius van den Beek
@mvdbeek
how do you derive the format from multiple inputs ?
I am still lost here. This should work fine with collection input if you set inherit_format. If you use multiple="true" you need to create an arbitrary number of outputs using discover_datasets. This will be difficult or impossible to use in workflows.
M Bernt
@bernt-matthias
Good points. What is the limiting factor for usability in workflows, i.e. in which case(s) do multiple="true" inputs not work in workflows?
Marius van den Beek
@mvdbeek
oh, that works fine, discovered outputs aren’t usable
there’s no output to attach them to
there is a single, primary dataset, but I guess that won’t be of much/any use
M Bernt
@bernt-matthias

So is that the case if you have data with discover_outputs (which I would understand), or also if you have a collection with discover_outputs? I will always have the latter case.

... but I have to admit that I have a few cases where I abuse data+discover to set the format, but it's always just a single dataset .. so in this case the single dataset is of use :) Nice

Marius van den Beek
@mvdbeek
yeah, if the discovered outputs are inside a collection it’s fine, you’ll have the collection output to work with
John Chilton
@jmchilton
I’m not following the conversation completely - but if the problem is just extensions not matching you can probably fix it with setting up a galaxy.json file
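A minimal sketch of what such a galaxy.json file could contain, assuming the name-keyed format in which each top-level key is an output name and `ext` sets its datatype. The output name `sample` and the extension `tsv` are placeholders, and the exact format should be checked against the Galaxy documentation for the tool profile in use.

```python
import json


def galaxy_json_payload(output_name, ext):
    """Return the JSON text for a galaxy.json file that overrides one output's extension.

    Assumption: Galaxy reads a mapping of output name -> metadata dict,
    where "ext" carries the datatype extension.
    """
    return json.dumps({output_name: {"ext": ext}})


# Placeholder output name and extension, for illustration only.
payload = galaxy_json_payload("sample", "tsv")
print(payload)
```

In a tool, the wrapper script would write this text to a file named galaxy.json in the job working directory.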
M Bernt
@bernt-matthias
Oha .. something new to learn .. is there something to read?
Sounds better than looking at the line of code that might do the workaround:
preprocessing += "${ ' && '.join([ \"ln -s '%s' '"+actual_parameter+"/%s.%s'\" % (_, re.sub('[^\w\-_]', '_', _.element_identifier), _.ext) for _ in $" + actual_parameter + " if _ ]) } && \n"
looks like perl in terms of readability .. lol
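To make the one-liner above easier to read: it links each input to a file named after its sanitized element identifier plus its extension. A small sketch of just the naming step, using the same regular expression; the helper name `link_name` and the example identifier are made up for illustration.

```python
import re


def link_name(element_identifier, ext):
    """Build '<sanitized identifier>.<ext>' as in the ln -s line above.

    Every character that is not a word character, '-' or '_' becomes '_'.
    """
    safe = re.sub(r"[^\w\-_]", "_", element_identifier)
    return "%s.%s" % (safe, ext)


# Hypothetical identifier containing spaces, parentheses and a slash.
print(link_name("sample 1 (rep/2)", "mzML"))  # sample_1__rep_2_.mzML
```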
Yvan Le Bras
@yvanlebras
;)
M Bernt
@bernt-matthias

Let's assume I have a select parameter with multiple="true". In a test, can I somehow express a value that contains a comma?

  • for options without comma (e.g. "Option A" and "Option B") it would be value="Option A,Option B"
  • but what if the option value contains a comma (e.g. "Option,A" and "Option,B")

Can one have something like

<param>
  <value="Option,A"/> 
  <value="Option,B"/>
</param>
Björn Grüning
@bgruening
@bernt-matthias you should get an iterator, shouldn't you? And then you can iterate over it. It's only a comma-separated list if you cast it to a string, isn't it?
M Bernt
@bernt-matthias
@bgruening using this in the command section isn't the problem, so your statement is 100% correct. My question is how to formulate a test for this.
Björn Grüning
@bgruening
Oh, sorry, missed that. Is "'option,a','option,b'" not working?
M Bernt
@bernt-matthias
Will try this. Thanks
John Chilton
@jmchilton
Can you put the comma in the label and not the value? A comma in the value is going to break things.
M Bernt
@bernt-matthias

yep:

    <inputs>
        <param name="sel" type="select" multiple="true" label="sel">
            <option value="o 1"></option>
            <option value="o 2"></option>
            <option value="o,1"></option>
            <option value="o,2"></option>
        </param>
    </inputs>
    <outputs>
        <data format="txt" name="sample"/>
    </outputs>
    <tests>
        <test>
            <param name="sel" value="o 1,o 2"/>
            <data format="txt" name="sample"/>

        </test>
        <test>
            <param name="sel" value="'o,1','o,2'"/>
            <data format="txt" name="sample"/>
        </test>
    </tests>

gives:

RunToolException: Error creating a job for these tool inputs - parameter 'sel': an invalid option (u"'o") was selected (valid options: o,1,o,2,o 1,o 2)

I guess implementing something that allows something like

        <test>
            <param name="sel" >
                <value>o,1</value>
                <value>o,2</value>
             </param>
            <data format="txt" name="sample"/>
        </test>

within galaxyproject/galaxy#9079 might be an option.

An alternative might be a sanitizer+mapping and to use the mapped value in the test, right?
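The error message quoted above is what one would expect if the test value is simply split on commas, with the quotes treated as ordinary characters. The sketch below reproduces that behaviour under this assumption; it is not Galaxy's actual parsing code, and `split_test_value` is a made-up name.

```python
VALID_OPTIONS = {"o 1", "o 2", "o,1", "o,2"}


def split_test_value(value):
    # Assumption: a plain comma split with no quote handling.
    return value.split(",")


ok_parts = split_test_value("o 1,o 2")
bad_parts = split_test_value("'o,1','o,2'")

print(ok_parts)   # ['o 1', 'o 2']
print(bad_parts)  # ["'o", "1'", "'o", "2'"]

# The first fragment that is not a valid option matches the one in the error.
first_invalid = next(p for p in bad_parts if p not in VALID_OPTIONS)
print(first_invalid)  # 'o
```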

Björn Grüning
@bgruening
@bernt-matthias you can always put something like o_1 in the value and handle the substitution elsewhere
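A rough sketch of that idea: the option values stay comma-free, so a test can use value="o_1,o_2", and a lookup table restores the real values later. The table and the function name are illustrative only; in a real tool the substitution would live in the command template or a wrapper script.

```python
# Comma-free option values mapped back to the values the underlying tool expects.
SAFE_TO_REAL = {"o_1": "o,1", "o_2": "o,2"}


def resolve(selected):
    """Translate comma-free option values to the real ones; unknown values pass through."""
    return [SAFE_TO_REAL.get(value, value) for value in selected]


resolved = resolve("o_1,o_2".split(","))
print(resolved)  # ['o,1', 'o,2']
```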
M Bernt
@bernt-matthias
You are right, but in the context of my "little" CTD -> tool converter .. it's a wee bit more tricky :)
Marius van den Beek
@mvdbeek
There’s a new samtools version out with lots of nice new things: https://github.com/samtools/samtools/releases/tag/1.10
It can write index files during sort, sam.gz seems to be recommended for huge contigs, and it no longer considers a zero-length file to be a valid SAM file.
Nicola Soranzo
@nsoranzo
Yeah, I saw the release notes.
Marius van den Beek
@mvdbeek
Making good progress:
the list of failing tests is at https://mvdbeek.github.io/iuc-tool-test-results/, as usual
we also have some regressions (this is testing against the current dev, which will probably become 20.01 pretty soon), so that’s good to see as well
Dan Fornika
@dfornika
Does anyone know of a simple (one-step) way to concatenate tabular files that contain a header? (while preserving a single header line at the top but removing all others)
Dan Fornika
@dfornika
Ahhh..... I've seen that one before but forgot about it. Thanks!
Björn Grüning
@bgruening
Can we not extend the cut tool to ignore headers?
Dan Fornika
@dfornika
Which tool exactly? Is it a shed tool, or is it built-in to the main galaxy codebase?
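For reference, the header-preserving concatenation asked about above can be sketched in a few lines of plain Python. This is not the Galaxy tool referred to in the chat, only an illustration of the logic, and it assumes every input starts with exactly one header line.

```python
import io


def concat_with_header(streams, out):
    """Concatenate line-based inputs, keeping only the first input's header line."""
    header_written = False
    for stream in streams:
        header = stream.readline()
        if not header:
            continue  # skip empty inputs
        if not header_written:
            out.write(header if header.endswith("\n") else header + "\n")
            header_written = True
        for line in stream:
            # Make sure a missing final newline does not glue two rows together.
            out.write(line if line.endswith("\n") else line + "\n")


first = io.StringIO("id\tval\na\t1\n")
second = io.StringIO("id\tval\nb\t2")
merged = io.StringIO()
concat_with_header([first, second], merged)
print(merged.getvalue())
```

The same function works with real files opened in text mode instead of the in-memory streams used here.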
Dan Fornika
@dfornika
Could I request a review on this PR that extends the tetyper tool to allow users to load variant profiles from tool data tables? galaxyproject/tools-iuc#2772
Lucille Delisle
@lldelisle
Dear iuc,
Is there a galaxy datatype that would be close to bedpe: https://bedtools.readthedocs.io/en/latest/content/general-usage.html#bedpe-format
Lucille Delisle
@lldelisle
Thx @bernt-matthias
Dan Fornika
@dfornika
Sorry to ask again, but could someone please review this PR that adds support to tetyper for loading variant profiles from tool data tables? Thanks! galaxyproject/tools-iuc#2772
Björn Grüning
@bgruening
on it @dfornika !
Greg Von Kuster
@gregvonkuster

I'm working with a lab doing research with bacterial genomes and I'm working on introducing their pipeline into Galaxy. In many cases the bacterial data consists of a combination of species, so mapping to a specific genome is tricky.

Mapping bacterial data to a reference genome has 3 options:

  1. the genome is known, so is selected from local cache as is often done in Galaxy

  2. a custom reference is selected - these custom references are a composition of files, and I'm not aware if/how this is currently handled in Galaxy. For example, the custom reference for Brucella_abortus1 consists of the following 7 files:

    • Bab1_define_filter.xlsx
    • Bab1_remove_from_analysis.xlsx
    • NC_006932-NC_006933.fasta
    • NC_006932-NC_006933.gbk
    • NC_006932-NC_006933.genome
    • NC_006932-NC_006933.gff
  3. The input data is inspected to determine the optimal genome to be mapped.

Does anyone know if this is currently being done in Galaxy? I've done some searching and only found this so far https://help.galaxyproject.org/t/mapping-rna-seq-data-to-a-composite-build-of-bacterial-genomes/1443.

I can wrap the tool for Galaxy that the lab is using, but I want to make sure I'm not re-inventing the wheel.

Jennifer Hillman-Jackson
@jennaj
Could someone who knows more about the status of GATK4 wrapper dev reply to this post? I see a few in done, some in progress, but didn't find a master ticket summary -- maybe missed it? Thx! https://help.galaxyproject.org/t/galaxy-tool-wrappers-for-gatk-4/2798
M Bernt
@bernt-matthias
Do you think it's good practice to list the allowed data types of a data input in the label or the help?
e.g. "Select TSV dataset(s)"
Nicola Soranzo
@nsoranzo
:+1:
M Bernt
@bernt-matthias
Maybe the Galaxy web UI should just do this? similar to -argument.