Marius van den Beek
@mvdbeek
you mean in the tool ?
I was talking about the history
M Bernt
@bernt-matthias
Yes - in the tool command sections
Marius van den Beek
@mvdbeek
the tool should be as complex as necessary
the history should be as simple as possible IMO
though heterogeneous collections are a pain as well
as you would create there
hmm, not sure we have a great solution there
I guess discovering the output and sniffing could work ?
but if you know the formats it’s probably best to rename as the sniffers aren’t perfect
M Bernt
@bernt-matthias
Most of the time there is one output (and potentially some optional ones) .. so the user can decide how much mess is wanted/needed .. the outputs are differentiated by name.
Marius van den Beek
@mvdbeek
but if you just have one output why would you want to use inherit_format ?
do you have the tool somewhere ?
M Bernt
@bernt-matthias

Sniffing might be an option .. and appealingly simple .. but there are some very underspecified data types used in OpenMS for which sniffing does not work.

I could post the CTD file that I am trying to autotranslate.

In the end I just want to set the format of the output and I know that it is the same as the input, but there are multiple possible input formats.
Marius van den Beek
@mvdbeek
how do you derive the format from multiple inputs ?
I am still lost here. This should work fine with collection input if you set inherit_format. If you use multiple="true" you need to create an arbitrary number of outputs using discover_datasets. This will be difficult to impossible to use in workflows
M Bernt
@bernt-matthias
Good points. What is the limiting point for the usability in workflows, i.e. in which case(s) do multiple="true" inputs not work in workflows?
Marius van den Beek
@mvdbeek
oh, that works fine; it's the discovered outputs that aren’t usable
there’s no output to attach them to
there is a single, primary dataset, but I guess that won’t be of much/any use
M Bernt
@bernt-matthias

So if you have data with discover_datasets (which I would understand) or also if you have a collection with discover_datasets? I will always have the latter case.

... but I have to admit that I have a few cases where I abuse data+discover to set the format, but it's always just a single data set .. so in this case the single data set is of use :) Nice

Marius van den Beek
@mvdbeek
yeah, if the discovered outputs are inside a collection it’s fine, you’ll have the collection output to work with
John Chilton
@jmchilton
I’m not following the conversation completely - but if the problem is just extensions not matching you can probably fix it with setting up a galaxy.json file
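A minimal sketch of writing such a galaxy.json from Python. The layout used here (output name mapped to an object with an `ext` field) is an assumption about Galaxy's tool-provided metadata format and should be checked against the Galaxy documentation; the helper name `write_galaxy_json`, the output name `out_file` and the extension `mzml` are hypothetical placeholders.

```python
import json
from pathlib import Path


def write_galaxy_json(directory, output_ext):
    """Write a galaxy.json file that maps each output name to an extension.

    Assumption: Galaxy accepts the layout {"<output name>": {"ext": "<ext>"}}
    for tool-provided metadata; verify this against the Galaxy docs.
    """
    metadata = {name: {"ext": ext} for name, ext in output_ext.items()}
    path = Path(directory) / "galaxy.json"
    path.write_text(json.dumps(metadata, indent=2) + "\n")
    return path
```

Called as `write_galaxy_json(".", {"out_file": "mzml"})` in the job working directory, this would declare that the (hypothetical) output `out_file` has the extension `mzml`.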
M Bernt
@bernt-matthias
Oha .. something new to learn .. is there something to read?
Sounds better than looking at the line of code that might do the workaround:
preprocessing += "${ ' && '.join([ \"ln -s '%s' '"+actual_parameter+"/%s.%s'\" % (_, re.sub('[^\w\-_]', '_', _.element_identifier), _.ext) for _ in $" + actual_parameter + " if _ ]) } && \n"
looks like perl in terms of readability .. lol
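For readability, the same idea (sanitize each element identifier, then symlink the dataset under `<identifier>.<ext>`) can be sketched in plain Python. This is an illustration, not the converter's code: the `(path, element_identifier, ext)` tuples stand in for Galaxy dataset objects, and `safe_name` and `link_commands` are hypothetical helper names.

```python
import re
import shlex


def safe_name(identifier):
    """Replace every character that is not a word character or '-' with '_'."""
    return re.sub(r"[^\w\-]", "_", identifier)


def link_commands(datasets, target_dir):
    """Build one `ln -s` command per dataset, joined with ' && '.

    `datasets` holds (path, element_identifier, ext) tuples; falsy entries
    (e.g. None for an unset optional input) are skipped, as in the original.
    """
    commands = []
    for item in datasets:
        if not item:
            continue
        path, identifier, ext = item
        link = f"{target_dir}/{safe_name(identifier)}.{ext}"
        # shlex.quote only adds quotes when the string needs them.
        commands.append(f"ln -s {shlex.quote(path)} {shlex.quote(link)}")
    return " && ".join(commands)
```

Using `shlex.quote` instead of hand-written single quotes keeps paths with spaces or quotes safe in the generated shell command.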
Yvan Le Bras
@yvanlebras
;)
M Bernt
@bernt-matthias

Let's assume I have a select parameter with multiple="true". For a test can I somehow express a value that contains a comma?

  • for options without comma (e.g. "Option A" and "Option B") it would be value="Option A,Option B"
  • but what if the option value contains a comma (e.g. "Option,A" and "Option,B")

Can one have something like

<param>
  <value="Option,A"/> 
  <value="Option,B"/>
</param>
Björn Grüning
@bgruening
@bernt-matthias you should get an iterator, shouldn't you? And then you can iterate over it. Only if you cast it to a string is it a comma-separated list, isn't it?
M Bernt
@bernt-matthias
@bgruening using this in the command section isn't the problem, so your statement is 100% correct. My question is how to formulate a test for this.
Björn Grüning
@bgruening
Oh, sorry, missed that. Is "'option,a','option,b'" not working?
M Bernt
@bernt-matthias
Will try this. Thanks
John Chilton
@jmchilton
Can you put the comma in the label and not the value? A comma in the value is going to break things
M Bernt
@bernt-matthias

yep:

    <inputs>
        <param name="sel" type="select" multiple="true" label="sel">
            <option value="o 1"></option>
            <option value="o 2"></option>
            <option value="o,1"></option>
            <option value="o,2"></option>
        </param>
    </inputs>
    <outputs>
        <data format="txt" name="sample"/>
    </outputs>
    <tests>
        <test>
            <param name="sel" value="o 1,o 2"/>
            <data format="txt" name="sample"/>

        </test>
        <test>
            <param name="sel" value="'o,1','o,2'"/>
            <data format="txt" name="sample"/>
        </test>
    </tests>

gives:

RunToolException: Error creating a job for these tool inputs - parameter 'sel': an invalid option (u"'o") was selected (valid options: o,1,o,2,o 1,o 2)
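The error is consistent with the test value being split on every comma before the pieces are checked against the option list. The sketch below only reproduces that behaviour; it assumes a plain comma split (inferred from the error message, not taken from the Galaxy source), and `invalid_selections` is a hypothetical helper.

```python
VALID_OPTIONS = ["o,1", "o,2", "o 1", "o 2"]


def invalid_selections(value, valid=VALID_OPTIONS):
    """Split a test value on commas and return the pieces that are not valid options."""
    return [piece for piece in value.split(",") if piece not in valid]


# Values without commas survive the split intact.
print(invalid_selections("o 1,o 2"))
# Quoting does not help: the quotes simply become part of the pieces.
print(invalid_selections("'o,1','o,2'"))
```

Under this assumption the first rejected piece of the quoted value is `'o`, which matches the option named in the error.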

I guess implementing something that allows something like

        <test>
            <param name="sel">
                <value>o,1</value>
                <value>o,2</value>
            </param>
            <data format="txt" name="sample"/>
        </test>

within galaxyproject/galaxy#9079 might be an option.

An alternative might be a sanitizer+mapping and to use the mapped value in the test, right?

Björn Grüning
@bgruening
@bernt-matthias you can always put something like o_1 in the value and handle the substitution elsewhere
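A minimal sketch of that substitution, assuming the comma is replaced by an underscore in the option value and mapped back before the real tool is called. `build_value_map` and `restore` are hypothetical names, and the collision check is an addition for safety rather than something proposed in the chat.

```python
def build_value_map(options, replacement="_"):
    """Map comma-free stand-in values to the original option values."""
    mapping = {}
    for original in options:
        safe = original.replace(",", replacement)
        if safe in mapping:
            # Two different originals must not share one stand-in value.
            raise ValueError(
                f"{original!r} and {mapping[safe]!r} both map to {safe!r}"
            )
        mapping[safe] = original
    return mapping


def restore(selected, mapping):
    """Translate a comma-separated list of stand-in values back to the originals."""
    return [mapping[safe] for safe in selected.split(",")]
```

With `mapping = build_value_map(["o,1", "o,2"])`, a test could use `value="o_1,o_2"` and `restore("o_1,o_2", mapping)` would recover the values that contain commas.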
M Bernt
@bernt-matthias
You are right, but in the context of my "little" CTD -> tool converter .. it's a wee bit more tricky :)
Marius van den Beek
@mvdbeek
There’s a new samtools version out with lots of nice new things: https://github.com/samtools/samtools/releases/tag/1.10
Can write index files during sort, sam.gz seems recommended for huge contigs, no longer considers a zero-length file to be a valid SAM file.
Nicola Soranzo
@nsoranzo
Yeah, I saw the release notes.
Marius van den Beek
@mvdbeek
Making good progress:
the list of failing tests is at https://mvdbeek.github.io/iuc-tool-test-results/, as usual
we also have some regressions (this is testing against the current dev, which will probably become 20.01 pretty soon), so that’s good to see as well
Dan Fornika
@dfornika
Does anyone know of a simple (one-step) way to concatenate tabular files that contain a header? (while preserving a single header line at the top but removing all others)
Dan Fornika
@dfornika
Ahhh..... I've seen that one before but forgot about it. Thanks!
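Outside of any particular Galaxy tool, the header-preserving concatenation asked about above can be sketched in a few lines of Python. This assumes every input file starts with exactly one header line; `concat_with_single_header` is a hypothetical helper, not an existing tool.

```python
def _with_newline(line):
    """Make sure a line ends with a newline so files do not run together."""
    return line if line.endswith("\n") else line + "\n"


def concat_with_single_header(paths, out_path):
    """Concatenate text files, keeping only the header line of the first file.

    Assumes each file starts with exactly one header line.
    """
    with open(out_path, "w") as out:
        for index, path in enumerate(paths):
            with open(path) as handle:
                header = handle.readline()
                # Keep the header of the first file, drop all later ones.
                if index == 0 and header:
                    out.write(_with_newline(header))
                for line in handle:
                    out.write(_with_newline(line))
```

The function does not check that the headers of the input files agree, so inputs with different columns would be concatenated silently.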
Björn Grüning
@bgruening
Can we not extend the cut tool to ignore headers?
Dan Fornika
@dfornika
Which tool exactly? Is it a shed tool, or is it built-in to the main galaxy codebase?