Marius van den Beek
@mvdbeek
but that’s just a guess, it might also become more messy
M Bernt
@bernt-matthias

Agree completely. Any other idea on how to set the format of the collection elements? In my case it would be nice to discover data sets, but only get the name from this and take the format from the corresponding input.

Background: OpenMS often requires that the output files have specific extensions which are (of course) not equal to the Galaxy extensions. The only idea I had so far is to rename the files after creation, such that discovering works.

Marius van den Beek
@mvdbeek
is this a 1 to 1 relationship ? each input produces one output but you need all the inputs for tool execution ?
M Bernt
@bernt-matthias
Yep, exactly.
From a tool programming perspective it seems best (at least in the amount of code needed) to have an input and an output collection. But I'm wondering about the usability if I force users to build a collection. Using a multiple="true" input is just more flexible.
Marius van den Beek
@mvdbeek
isn’t this going to be quite messy if you have more than 2 inputs ?
ok, exaggerating here, but I easily get lost when a tool produces more than 5 outputs
this would be less of an issue if you produce a collection output
M Bernt
@bernt-matthias
I just create a folder for each input and output for storing files and links.
Marius van den Beek
@mvdbeek
you mean in the tool ?
I was talking about the history
M Bernt
@bernt-matthias
Yes - in the tool command sections
Marius van den Beek
@mvdbeek
the tool should be as complex as necessary
the history should be as simple as possible IMO
though heterogeneous collections are a pain as well
as you would create there
hmm, not sure we have a great solution there
I guess discovering the output and sniffing could work ?
but if you know the formats it’s probably best to rename as the sniffers aren’t perfect
M Bernt
@bernt-matthias
Most of the time there is one output (and potentially some optional ones) .. so the user can decide how much mess is wanted/needed .. the outputs are differentiated by name.
Marius van den Beek
@mvdbeek
but if you just have one output why would you want to use inherit_format ?
do you have the tool somewhere ?
M Bernt
@bernt-matthias

Sniffing might be an option .. and appealingly simple .. but there are some very underspecified data types used in OpenMS for which sniffing does not work.

I could post a CTD file that I am trying to auto-translate.

In the end I just want to set the format of the output and I know that it is the same as the input, but there are multiple possible input formats.
Marius van den Beek
@mvdbeek
how do you derive the format from multiple inputs ?
I am still lost here. This should work fine with collection input if you set inherit_format. If you use multiple=“true” you need to create an arbitrary number of outputs using discover_datasets. This will be difficult to impossible to use in workflows
M Bernt
@bernt-matthias
Good points. What is the limiting factor for usability in workflows, i.e. in which case(s) do multiple=“true” inputs not work in workflows?
Marius van den Beek
@mvdbeek
oh, that works fine, discovered outputs aren’t usable
there’s no output to attach them to
there is a single, primary dataset, but I guess that won’t be of much/any use
M Bernt
@bernt-matthias

So is that only the case if you have data with discover_datasets (which I would understand), or also if you have a collection with discover_datasets? I will always have the latter case.

... but I have to admit that I have a few cases where I abuse data+discover to set the format, but it's always just a single dataset .. so in this case the single dataset is of use :) Nice

Marius van den Beek
@mvdbeek
yeah, if the discovered outputs are inside a collection it’s fine, you’ll have the collection output to work with
John Chilton
@jmchilton
I’m not following the conversation completely - but if the problem is just extensions not matching you can probably fix it with setting up a galaxy.json file
M Bernt
@bernt-matthias
Oha .. something new to learn .. is there something to read?
Sounds better than looking at the line of code that might do the workaround:
preprocessing += "${ ' && '.join([ \"ln -s '%s' '"+actual_parameter+"/%s.%s'\" % (_, re.sub('[^\w\-_]', '_', _.element_identifier), _.ext) for _ in $" + actual_parameter + " if _ ]) } && \n"
looks like perl in terms of readability .. lol
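
For readability, the logic of that one-liner can be sketched as plain Python. This is only an illustration of what the generated expression appears to do (one `ln -s` per non-empty input, named after the sanitized element identifier plus the Galaxy extension). `InputDataset`, `sanitize` and `link_commands` are hypothetical names, not part of Galaxy or the converter.

```python
import re
from typing import List, NamedTuple, Optional


class InputDataset(NamedTuple):
    """Hypothetical stand-in for a Galaxy input dataset wrapper."""
    path: str                # what str(dataset) would give in the tool template
    element_identifier: str  # name of the element in the history/collection
    ext: str                 # Galaxy extension of the dataset


def sanitize(identifier: str) -> str:
    # Replace every character that is not a word character or a hyphen with "_".
    return re.sub(r"[^\w\-]", "_", identifier)


def link_commands(datasets: List[Optional[InputDataset]], target_dir: str) -> str:
    # Build one "ln -s" command per dataset, skipping unset (None) inputs,
    # and chain the commands with "&&" as the original line does.
    commands = [
        "ln -s '%s' '%s/%s.%s'" % (d.path, target_dir, sanitize(d.element_identifier), d.ext)
        for d in datasets
        if d is not None
    ]
    return " && ".join(commands)


example = [InputDataset("/data/dataset_1.dat", "sample A.raw", "mzml"), None]
print(link_commands(example, "in"))
```

The link name carries the extension the tool expects, so the tool sees `in/sample_A_raw.mzml` instead of Galaxy's generic `dataset_1.dat`.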
Yvan Le Bras
@yvanlebras
;)
M Bernt
@bernt-matthias

Let's assume I have a select parameter with multiple="true". In a test, can I somehow express a value that contains a comma?

  • for options without comma (e.g. "Option A" and "Option B") it would be value="Option A,Option B"
  • but what if the option value contains a comma (e.g. "Option,A" and "Option,B")

Can one have something like

<param>
  <value>Option,A</value>
  <value>Option,B</value>
</param>
Björn Grüning
@bgruening
@bernt-matthias you should get an iterator, right? And then you can iterate over it. It's only a comma-separated list if you cast it to a string, isn't it?
M Bernt
@bernt-matthias
@bgruening using this in the command section isn't the problem, so your statement is 100% correct. My question is how to formulate a test for this.
Björn Grüning
@bgruening
Oh, sorry, missed that. Is "'option,a','option,b'" not working?
M Bernt
@bernt-matthias
Will try this. Thanks
John Chilton
@jmchilton
Can you put the comma in the label and not the value? A comma in the value is going to break things.
M Bernt
@bernt-matthias

yep:

    <inputs>
        <param name="sel" type="select" multiple="true" label="sel">
            <option value="o 1"></option>
            <option value="o 2"></option>
            <option value="o,1"></option>
            <option value="o,2"></option>
        </param>
    </inputs>
    <outputs>
        <data format="txt" name="sample"/>
    </outputs>
    <tests>
        <test>
            <param name="sel" value="o 1,o 2"/>
            <data format="txt" name="sample"/>

        </test>
        <test>
            <param name="sel" value="'o,1','o,2'"/>
            <data format="txt" name="sample"/>
        </test>
    </tests>

gives:

RunToolException: Error creating a job for these tool inputs - parameter 'sel': an invalid option (u"'o") was selected (valid options: o,1,o,2,o 1,o 2)
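
The reported invalid option `'o` is what a plain split on commas would produce for the quoted value. The sketch below reproduces that, assuming the test value is simply split on `,` with no quote handling. That assumption is inferred from the error message, not checked against the Galaxy source, and `naive_split` and `invalid_choices` are hypothetical helper names.

```python
# Option values declared in the <param name="sel"> above.
VALID_OPTIONS = ["o,1", "o,2", "o 1", "o 2"]


def naive_split(value):
    # Assumed behaviour: split the test value on every comma, ignoring quotes.
    return value.split(",")


def invalid_choices(value):
    # Return the pieces that do not match any declared option value.
    return [choice for choice in naive_split(value) if choice not in VALID_OPTIONS]


print(invalid_choices("o 1,o 2"))      # values without commas survive the split
print(invalid_choices("'o,1','o,2'"))  # quoting does not protect the commas
```

Under this assumption the first rejected piece of the quoted value is `'o`, matching the option named in the exception.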

I guess implementing something that allows something like

        <test>
            <param name="sel">
                <value>o,1</value>
                <value>o,2</value>
            </param>
            <data format="txt" name="sample"/>
        </test>

within galaxyproject/galaxy#9079 might be an option.

An alternative might be a sanitizer+mapping and to use the mapped value in the test, right?

Björn Grüning
@bgruening
@bernt-matthias you can always put something like o_1 in the value and handle the substitution elsewhere
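
A minimal sketch of that substitution idea: comma-free placeholder values are used in the tool and its tests, and mapped back to the real values before they reach the command line. The mapping table and the function name `restore_values` are hypothetical and only illustrate the approach.

```python
# Hypothetical mapping from comma-free placeholder values to the real option values.
PLACEHOLDER_TO_REAL = {
    "o_1": "o,1",
    "o_2": "o,2",
}


def restore_values(selected):
    # "selected" is the comma-separated string of chosen placeholder values;
    # values without a mapping entry are passed through unchanged.
    return [PLACEHOLDER_TO_REAL.get(value, value) for value in selected.split(",")]


print(restore_values("o_1,o_2,o 1"))
```

The split on commas is safe here because none of the placeholder values contains one.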
M Bernt
@bernt-matthias
You are right, but in the context of my "little" CTD -> tool converter .. it's a wee bit more tricky :)
Marius van den Beek
@mvdbeek
There’s a new samtools version out with lots of nice new things: https://github.com/samtools/samtools/releases/tag/1.10
It can write index files during sort, sam.gz seems recommended for huge contigs, and it no longer considers a zero-length file to be a valid SAM file.
Nicola Soranzo
@nsoranzo
Yeah, I saw the release notes.