Ghost
@ghost~5772e7e2c2f0db084a206e1b
if we just template out a dockerfile we can do the same for singularity, giving us on-the-fly singularity builds without needing docker, that would be another advantage
and we don’t need host mounts either, which aren’t properly cleaned up
there are lots of advantages to be had if we simplify this
this wouldn’t change anything whatsoever in how you use the mulled-build ecosystem
same commandline flags etc
Ghost
@ghost~5772e7e2c2f0db084a206e1b
and we can also template out the ENV vars that conda activate puts into the environment
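A minimal sketch of what "templating out a Dockerfile" with a multi-stage build and ENV lines could look like. This is not the mulled-build implementation: the function name `render_dockerfile`, the image names, the `/opt/env` prefix and the example package are all assumptions chosen for illustration.

```python
from string import Template

# Hypothetical two-stage Dockerfile: build the conda env in one image,
# then copy only the finished prefix into a small base image.
DOCKERFILE = Template("""\
FROM ${build_image} AS build
RUN conda create --yes --copy --prefix ${prefix} ${channels} ${packages}

FROM ${base_image}
COPY --from=build ${prefix} ${prefix}
${env_lines}
""")


def render_dockerfile(packages, channels, env,
                      build_image="continuumio/miniconda3",  # assumed build image
                      base_image="busybox:glibc",            # assumed target image
                      prefix="/opt/env"):                    # assumed env location
    """Render the Dockerfile text for the given packages, channels and ENV vars."""
    return DOCKERFILE.substitute(
        build_image=build_image,
        base_image=base_image,
        prefix=prefix,
        channels=" ".join("-c " + c for c in channels),
        packages=" ".join(packages),
        # One ENV line per variable, e.g. those that `conda activate` would set.
        env_lines="\n".join('ENV %s="%s"' % (k, v) for k, v in sorted(env.items())),
    )


text = render_dockerfile(
    packages=["samtools=1.9"],
    channels=["conda-forge", "bioconda"],
    env={"PATH": "/opt/env/bin:$PATH"},
)
print(text)
```

The same rendered package list and ENV mapping could in principle feed a second template for a Singularity definition file, which is the advantage mentioned above.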
Björn Grüning
@bgruening
I'm not worried about that. Many people complained about the lua stuff, so there is probably some appeal to using multi-stage builds - I guess lua was the wrong choice. I just think it's bad timing. This really needs to be tested carefully before we can roll it out. It has a large impact, and not only on our ecosystem.
Ghost
@ghost~5772e7e2c2f0db084a206e1b
the lua file is quite simple in what it does, I will produce tests for all cases
Björn Grüning
@bgruening
Sorry, I'm in an EOSC meeting, I cannot give this topic the concentration it deserves
Ghost
@ghost~5772e7e2c2f0db084a206e1b
alright, no worries
Ghost
@ghost~5772e7e2c2f0db084a206e1b
let’s see, maybe I can save myself the work: involucro/involucro#73
M Bernt
@bernt-matthias
Is there any convention if characters in the help attribute should/need to be quoted (e.g. & < ...)?
Damn I need to quote them here :) lol
Ghost
@ghost~5772e7e2c2f0db084a206e1b
I guess it depends how many of them there are
if it’s utterly unreadable I’d create a token
Nicola Soranzo
@nsoranzo
Use CDATA?
Ghost
@ghost~5772e7e2c2f0db084a206e1b
can’t do that within an attribute (AFAIK)
but it’d work in a token
Nicola Soranzo
@nsoranzo
<param name="foo" type="text" value="">
    <help><![CDATA[
I'm using weird stuff here >&<
    ]]></help>
</param>
I think that should work, if it is worth it
Ghost
@ghost~5772e7e2c2f0db084a206e1b
neat, didn’t know that
M Bernt
@bernt-matthias
Maybe I will try, actually I have the same problem for the value attribute of a param in a test :)
Ghost
@ghost~5772e7e2c2f0db084a206e1b
I guess the same trick would work there ?
M Bernt
@bernt-matthias
Will give it a try .. Thanks
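A small check of the two escaping routes discussed above, assuming plain Python with only the standard library. CDATA covers element text such as `<help>`, while attribute values (for example `value` on a test `param`) need entity escaping, which `quoteattr` produces. The `param` snippets are illustrative, not taken from a real tool.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

help_text = "I'm using weird stuff here >&<"

# Element text: CDATA lets the special characters stay unescaped.
param_xml = """<param name="foo" type="text" value="">
    <help><![CDATA[I'm using weird stuff here >&<]]></help>
</param>"""
parsed_help = ET.fromstring(param_xml).find("help").text

# Attribute value: CDATA is not allowed here, so escape to entities instead.
attr = quoteattr(help_text)  # includes the surrounding quotes
test_xml = "<param name=\"foo\" value=%s/>" % attr
parsed_value = ET.fromstring(test_xml).get("value")

print(attr)
print(parsed_help == help_text, parsed_value == help_text)
```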
M Bernt
@bernt-matthias
Seems to work.

Maybe a trickier question: I have a string parameter that should accept a list of space-separated strings. Strings containing spaces need to be quoted by the user. I already have a validator + sanitizer to check for this.

Now the question: Is there an easy way to transform the string such that each string is quoted (without requiring the user to do this)? As far as I have seen, mapping only works on single characters?
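One possible way to do the re-quoting, sketched in plain Python with `shlex`. Whether and where this can run in a Galaxy tool (for instance inside the command template or a small wrapper script) is an assumption that is not settled in this thread, and `quote_items` is a hypothetical helper name.

```python
import shlex


def quote_items(raw):
    """Split a space-separated list (honouring user quotes) and re-quote each item."""
    items = shlex.split(raw)  # 'a "b c" d' -> ['a', 'b c', 'd']
    # shlex.quote only adds quotes where the shell needs them.
    return " ".join(shlex.quote(item) for item in items)


print(quote_items('a "b c" d'))
```

Note that `shlex.quote` leaves simple tokens such as `a` unquoted; if every item must carry quotes regardless, the join step would need its own quoting rule.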

M Bernt
@bernt-matthias
I guess usually a repeat would be the preferred choice but those lack the possibility (as far as I know) to specify default values for a default number of repeats.
Björn Grüning
@bgruening
@bernt-matthias maybe a repeat is easier here?
M Bernt
@bernt-matthias
I need this for the CTD converter. And to me it seems that I can not define a repeat with a default of x repeat units and specify separate default values for each repeat unit ... which I need for the 1:1 translation of the CTD to Galaxy-xml
So currently I hope that I can implement this as space separated string, because this is what the OpenMS tools expect anyway.
John Chilton
@jmchilton
I'm having a hard time imagining what you're describing - can you paste or link a small CTD example?
(of this repeat with different defaults per unit)
Martin Cech
@martenson
Ghost
@ghost~5772e7e2c2f0db084a206e1b
Is this really Heng Li’s bwa ?
looks weird
Martin Cech
@martenson
he is an author on the paper
Ghost
@ghost~5772e7e2c2f0db084a206e1b
I am aware, and he made an announcement, but all his other projects are in the lh3 namespace
paper is a good indicator though
ok, there’s also a link from https://github.com/lh3/bwa
Ghost
@ghost~5772e7e2c2f0db084a206e1b
not recommended for production uses at the moment
idk, let’s wait
Brad Langhorst
@bwlang
i’ve tested this… it’s fast.
produced exactly the same alignments in my small test
Ghost
@ghost~5772e7e2c2f0db084a206e1b
does it have the same options ?
Brad Langhorst
@bwlang
I didn’t compare carefully - it did not strike me as different though
Usage: bwa2 mem [options] <idxbase> <in1.fq> [in2.fq]
Options:
  Algorithm options:
    -o STR        Output SAM file name
    -t INT        number of threads [1]
    -k INT        minimum seed length [19]
    -w INT        band width for banded alignment [100]
    -d INT        off-diagonal X-dropoff [100]
    -r FLOAT      look for internal seeds inside a seed longer than {-k} * FLOAT [1.5]
    -y INT        seed occurrence for the 3rd round seeding [20]
    -c INT        skip seeds with more than INT occurrences [500]
    -D FLOAT      drop chains shorter than FLOAT fraction of the longest overlapping chain [0.50]
    -W INT        discard a chain if seeded bases shorter than INT [0]
    -m INT        perform at most INT rounds of mate rescues for each read [50]
    -S            skip mate rescue
    -o            output file name missing
    -P            skip pairing; mate rescue performed unless -S also in use
Scoring options:
   -A INT        score for a sequence match, which scales options -TdBOELU unless overridden [1]
   -B INT        penalty for a mismatch [4]
   -O INT[,INT]  gap open penalties for deletions and insertions [6,6]
   -E INT[,INT]  gap extension penalty; a gap of size k cost '{-O} + {-E}*k' [1,1]
   -L INT[,INT]  penalty for 5'- and 3'-end clipping [5,5]
   -U INT        penalty for an unpaired read pair [17]
Input/output options:
   -p            smart pairing (ignoring in2.fq)
   -R STR        read group header line such as '@RG\tID:foo\tSM:bar' [null]
   -H STR/FILE   insert STR to header if it starts with @; or insert lines in FILE [null]
   -j            treat ALT contigs as part of the primary assembly (i.e. ignore <idxbase>.alt file)
   -v INT        verbose level: 1=error, 2=warning, 3=message, 4+=debugging [3]
   -T INT        minimum score to output [30]
   -h INT[,INT]  if there are <INT hits with score >80% of the max score, output all in XA [5,200]
   -a            output all alignments for SE or unpaired PE
   -C            append FASTA/FASTQ comment to SAM output
   -V            output the reference FASTA header in the XR tag
   -Y            use soft clipping for supplementary alignments
   -M            mark shorter split hits as secondary
   -I FLOAT[,FLOAT[,INT[,INT]]]
                 specify the mean, standard deviation (10% of the mean if absent), max
                 (4 sigma from the mean if absent) and min of the insert size distribution.
                 FR orientation only. [inferred]
Note: Please read the man page for detailed description of the command line and options.
looks like a superset of bwa
Usage: bwa mem [options] <idxbase> <in1.fq> [in2.fq]

Algorithm options:

       -t INT        number of threads [1]
       -k INT        minimum seed length [19]
       -w INT        band width for banded alignment [100]
       -d INT        off-diagonal X-dropoff [100]
       -r FLOAT      look for internal seeds inside a seed longer than {-k} * FLOAT [1.5]
       -y INT        seed occurrence for the 3rd round seeding [20]
       -c INT        skip seeds with more than INT occurrences [500]
       -D FLOAT      drop chains shorter than FLOAT fraction of the longest overlapping chain [0.50]
       -W INT        discard a chain if seeded bases shorter than INT [0]
       -m INT        perform at most INT rounds of mate rescues for each read [50]
       -S            skip mate rescue
       -P            skip pairing; mate rescue performed unless -S also in use
       -e            discard full-length exact matches

Scoring options:

       -A INT        score for a sequence match, which scales options -TdBOELU unless overridden [1]
       -B INT        penalty for a mismatch [4]
       -O INT[,INT]  gap open penalties for deletions and insertions [6,6]
       -E INT[,INT]  gap extension penalty; a gap of size k cost '{-O} + {-E}*k' [1,1]
       -L INT[,INT]  penalty for 5'- and 3'-end clipping [5,5]
       -U INT        penalty for an unpaired read pair [17]

       -x STR        read type. Setting -x changes multiple parameters unless overriden [null]
                     pacbio: -k17 -W40 -r10 -A1 -B1 -O1 -E1 -L0  (PacBio reads to ref)
                     ont2d: -k14 -W20 -r10 -A1 -B1 -O1 -E1 -L0  (Oxford Nanopore 2D-reads to ref)
                     intractg: -B9 -O16 -L5  (intra-species contigs to ref)

Input/output options:

       -p            smart pairing (ignoring in2.fq)
       -R STR        read group header line such as '@RG\tID:foo\tSM:bar' [null]
       -H STR/FILE   insert STR to header if it starts with @; or insert lines in FILE [null]
       -j            treat ALT contigs as part of the primary assembly (i.e. ignore <idxbase>.alt file)

       -v INT        verbose level: 1=error, 2=warning, 3=message, 4+=debugging [3]
       -T INT        minimum score to output [30]
       -h INT[,INT]  if there are <INT hits with score >80% of the max score, output all in XA [5,200]
       -a            output all alignments for SE or unpaired PE
       -C            append FASTA/FASTQ comment to SAM output
       -V            output the reference FASTA header in the XR tag
       -Y            use soft clipping for supplementary alignments
       -M            mark shorter split hits as secondary

       -I FLOAT[,FLOAT[,INT[,INT]]]
                     specify the mean, standard deviation (10% of the mean if absent), max
                     (4 sigma from the mean if absent) and min of the insert size distribution.
                     FR orientation only. [inferred]

Note: Please read the man page for detailed description of the command line and options.
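The two help texts can also be compared mechanically. A minimal sketch, assuming the help output has been captured as strings; only short excerpts of the pasted text are used here, and `parse_flags` is a hypothetical helper.

```python
import re


def parse_flags(help_text):
    """Collect single-letter flags from lines that begin with '-X'."""
    flags = set()
    for line in help_text.splitlines():
        match = re.match(r"\s*-([A-Za-z])\b", line)
        if match:
            flags.add(match.group(1))
    return flags


# Excerpts copied from the two usage messages above.
BWA2_HELP = """
    -o STR        Output SAM file name
    -t INT        number of threads [1]
    -S            skip mate rescue
    -P            skip pairing; mate rescue performed unless -S also in use
"""
BWA_HELP = """
       -t INT        number of threads [1]
       -S            skip mate rescue
       -P            skip pairing; mate rescue performed unless -S also in use
       -e            discard full-length exact matches
       -x STR        read type. Setting -x changes multiple parameters unless overriden [null]
"""

only_in_bwa2 = parse_flags(BWA2_HELP) - parse_flags(BWA_HELP)
only_in_bwa = parse_flags(BWA_HELP) - parse_flags(BWA2_HELP)
print(sorted(only_in_bwa2), sorted(only_in_bwa))
```

On these excerpts, `-o` shows up only for bwa2 and `-e` and `-x` only for bwa, so running the comparison on the full outputs would be worthwhile before wrapping the tool.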
Ghost
@ghost~5772e7e2c2f0db084a206e1b
in principle our caching approach is much more fine-grained in that we can decide which parameters need to match
but we don’t have the UI and we need dataset hashes to make this really efficient and useful
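A rough sketch of the idea, not Galaxy's actual job cache: a cache key built from dataset hashes plus only those parameters that are declared relevant. The function `job_cache_key`, the tool id, the parameter names and the hash strings are all hypothetical.

```python
import hashlib
import json


def job_cache_key(tool_id, params, dataset_hashes, relevant_params):
    """Hash the tool id, the input dataset hashes and the selected parameters."""
    selected = {name: params[name] for name in relevant_params if name in params}
    payload = json.dumps(
        {"tool": tool_id, "params": selected, "inputs": sorted(dataset_hashes)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Thread count is treated as irrelevant here, seed length as relevant.
key_a = job_cache_key("bwa_mem", {"k": 19, "threads": 4}, ["sha256:abc"], ["k"])
key_b = job_cache_key("bwa_mem", {"k": 19, "threads": 16}, ["sha256:abc"], ["k"])
key_c = job_cache_key("bwa_mem", {"k": 17, "threads": 4}, ["sha256:abc"], ["k"])
print(key_a == key_b, key_a == key_c)
```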
Martin Cech
@martenson
wrong channel :)