These are chat archives for nextflow-io/nextflow

9th
Oct 2017
Daniel E Cook
@danielecook
Oct 09 2017 00:58
Spent the last day racking my brains because I was using --with-docker instead of -with-docker. Perhaps these parameter names should be reserved or warned against?
Paolo Di Tommaso
@pditommaso
Oct 09 2017 07:37
oops.. I think that a warning is already implemented in some branch. Need to check!
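For context, a minimal sketch of the distinction (assuming a script named main.nf): single-dash flags are Nextflow's own CLI options, while double-dash flags are parsed as pipeline parameters under params, so the extra dash silently defines a parameter instead of enabling Docker:

```console
# single dash: a Nextflow option, enables Docker execution
nextflow run main.nf -with-docker

# double dash: parsed as a pipeline parameter (visible under params),
# Docker is NOT enabled
nextflow run main.nf --with-docker
```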
Anthony Underwood
@aunderwo
Oct 09 2017 13:18
@pditommaso I remember you once said Docker for dev and Singularity for prod. Any reason why you'd not go straight to Singularity, or is the build process for Docker images easier/quicker?
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:20
because we tend to work on local computers, on which Docker is better supported
unless you develop on a Linux system ..
Anthony Underwood
@aunderwo
Oct 09 2017 13:20
on a Mac?
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:20
yep
also Docker is the de-facto image standard
Anthony Underwood
@aunderwo
Oct 09 2017 13:21
right
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:21
IMO makes sense to store docker images in a Docker registry, which can be used by different systems
Anthony Underwood
@aunderwo
Oct 09 2017 13:21
I guess for "shareability" that makes sense
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:22
however, in a large Singularity-based system it could make sense to go directly with Singularity images
Anthony Underwood
@aunderwo
Oct 09 2017 13:23
Then run docker2singularity to convert? It's always worked for me, but do you have any cases where it hasn't?
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:24
nope, using singularity pull docker://etc
NF has built-in support for that
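A minimal sketch of what that built-in support can look like in nextflow.config (the image name is just an example): with the Singularity scope enabled, Nextflow pulls and converts the Docker image transparently:

```nextflow
// nextflow.config -- example image, adjust to your own
process.container = 'docker://ubuntu:16.04'
singularity.enabled = true
```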
Anthony Underwood
@aunderwo
Oct 09 2017 13:25
Ah ok - that assumes a public docker repo. What if it's private?
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:25
good point
You will need to pull/convert manually (for now)
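One possible manual route (the registry host, image, and credentials below are placeholders): if I recall correctly, Singularity can read registry credentials from environment variables, so a private Docker image can still be pulled directly:

```shell
# hypothetical private registry and image
export SINGULARITY_DOCKER_USERNAME=myuser
export SINGULARITY_DOCKER_PASSWORD=mysecret
singularity pull docker://registry.example.com/myorg/mytool:1.0
```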
Anthony Underwood
@aunderwo
Oct 09 2017 13:28
using the docker2singularity docker image?
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:29
I think that tool is deprecated; the Singularity folks suggest using singularity pull docker://etc (ver 2.3.x)
and singularity build image docker://etc (ver 2.4)
Luca Cozzuto
@lucacozzuto
Oct 09 2017 13:30
Well, let me tell you that I use docker2singularity and it works
and if you want to convert something without the use of Docker Hub it is the only way
otherwise you will have trouble with environment variables
Anthony Underwood
@aunderwo
Oct 09 2017 13:33
what would be the manual convert process? I am using the docker2singularity Docker image and it works, but if there is a better way for non-Docker Hub images that should be best practice
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:35
if you are not storing into a registry, I think the only way is docker2singularity
still, I have the feeling that the best approach is to use a private/local registry and convert with singularity pull/build
Anthony Underwood
@aunderwo
Oct 09 2017 13:36
Just asking over on the Singularity Slack group
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:36
:+1:
Anthony Underwood
@aunderwo
Oct 09 2017 13:54
Is it possible to have more than 2 items in a tuple for a channel?
i.e. can I do
set val(pair_id), file("*.fastq"), file("output.tsv") into channel1
Paolo Di Tommaso
@pditommaso
Oct 09 2017 13:55
as many as you want ..
Venkat Malladi
@vsmalladi
Oct 09 2017 14:03
@aunderwo ya i do that
set sampleId, file('*.fq.gz'), biosample, factor, treatment, replicate, controlId into trimmedReads
Ali Al-Hilli
@Ali_Hilli_twitter
Oct 09 2017 14:31

What is the best way to use pair_id in other processes when defining it in one process:

Channel
  .fromFilePairs( input_fastqs )
  .ifEmpty { error "Cannot find any reads matching: ${input_fastqs}" }
  .set { read_pairs }

process qa_and_trim {

    publishDir "${output_dir}/${pair_id}", mode: 'copy'

    module 'phe/qa_and_trim'

    input:
    set pair_id, file(file_pair) from read_pairs
}

The reason I want it in the other processes is that I want the outputs to go into "${output_dir}/${pair_id}" for all of them

Paolo Di Tommaso
@pditommaso
Oct 09 2017 14:32
put it in the output, e.g.
process qa_and_trim {

    publishDir "${output_dir}/${pair_id}", mode: 'copy'

    module 'phe/qa_and_trim'

    input:
    set pair_id, file(file_pair) from read_pairs

    output:
    set pair_id, file(..) into something 
}
Simone Baffelli
@baffelli
Oct 09 2017 14:34
Hello. I am encountering a bit of a problem with the staging of input files: I'm combining a channel with a copy of itself because I'm interested in the combination of all files with all others. However, each of these files comes from an upstream process and gets the same name slcRes, so in the process that needs them I only see one of them, because both files are staged with their original names.
Anthony Underwood
@aunderwo
Oct 09 2017 14:34
@pditommaso @Ali_Hilli_twitter and my questions were linked :) Coupling lots of outputs into a single channel seems to be the way to go!
Paolo Di Tommaso
@pditommaso
Oct 09 2017 14:36
@baffelli you need to give them different names ..
Simone Baffelli
@baffelli
Oct 09 2017 14:36
found it out
This should work right?
            set file(master:"master"), file(masterPar:"masterPar"), val(masterId), file(slaveInterp:"slave"), 
            file(slaveParInterp:"slavePar"), val(slaveId), file(off_par) into coreg
I was confusing the name given to a file in the script context with the filename exposed to the fs when it is staged in the process directory
Paolo Di Tommaso
@pditommaso
Oct 09 2017 14:37
it should, but I'm not 100% sure
Simone Baffelli
@baffelli
Oct 09 2017 14:38
does not seem to at the moment
i'll try again
Anthony Underwood
@aunderwo
Oct 09 2017 14:39
@baffelli interesting what does the syntax file(master:"master") do?
Simone Baffelli
@baffelli
Oct 09 2017 14:39
it should stage the files with a given name
at least it does that with lists of files
but it looks like it won't work with single files
4th entry in the table
Ok I'm very stupid, I was using it for the output instead of for the input
Anthony Underwood
@aunderwo
Oct 09 2017 14:43
OK that's new syntax to me - how do you use the staged name?
Simone Baffelli
@baffelli
Oct 09 2017 14:43
If you wait a minute I can show you a complete example
Anthony Underwood
@aunderwo
Oct 09 2017 14:43
Cool :thumbsup:
Simone Baffelli
@baffelli
Oct 09 2017 14:48
process coregister{


        input:
            set file(master:"master.slc"), file(masterPar:"master.slc.par"), val(masterId), file(slave:"slave.slc"),
            file(slavePar:"slave.slc.par"), val(slaveId)  from secondCoregistration
            val rlks_init from params.rlks_init
            val azlks_init from params.azlks_init
            val azwin from params.azwin
            val rwin from params.rwin
            val azwin_init from params.azwin_init
            val rwin_init from params.rwin_init
            val nr from params.nr
            val naz from params.naz

        output:
            set file(master), file(masterPar), val(masterId), file(slaveInterp), 
            file(slaveParInterp), val(slaveId), file(off_par) into coreg



    script:
        //Choose which method to use to coregister data (power offset or rather interferometric correlation)
        println("a")
        offset_cmd = params.offset_method == "pwr" ?
        " offset_pwr ${master} ${slave} ${slavePar} ${masterPar} off_par offs ccp ${rwin} ${azwin} - - ${nr} ${naz} - - - - -" 
        :"offset_SLC ${master} ${slave} ${slavePar} ${masterPar} off_par offs ccp ${params.ifgram_chip}  ${params.ifgram_chip} - - ${nr} ${naz} - ${params.ifgram_chip} - - -"
        """
        create_offset ${masterPar} ${slavePar} off_par 1 ${rlks_init} ${azlks_init} 0
        init_offset ${master} ${slave} ${masterPar} ${slavePar}  off_par ${rlks_init} ${azlks_init} - - - - - ${rwin_init} ${azwin_init}
        ${offset_cmd}
        offset_fit offs ccp off_par - - - 1
        SLC_interp ${slave} ${slavePar} ${masterPar} off_par slaveInterp slaveParInterp - -
        """
}
Here it is. What this does is stage the file slave with the local filename slave.slc, and the same for master with master.slc etc. This is useful because both master and slave are derived from the same upstream process, which produces an output called slcRes. If I did not use that, they would both be staged with the same name, so only the last file to be received would be seen. I discovered it because the difference image between slave and master was empty.
Don't let the rest of the command intimidate you
Simone Baffelli
@baffelli
Oct 09 2017 15:39

What exactly does that mean:

ERROR ~ Error executing process > 'ifgram (1)'

Caused by:
  master.slc.par

in the context of this process:

        input:
            set file(master), file(masterPar), val(masterId), 
            file(slave), file(slavePar), val(slaveId), file(off_par) from coreg
            val rlks from params.rlks
            val azlks from params.azlks

        output:
            set file(ifgram), file(masterMli), file(slaveMli), file(off_par), val(baseline), val(masterId), val(slaveId) into (ifgram_cc, ifgram_unw)
            file('ifgram.bmp')

        shell:
            log.info("Computing interferogram between ${masterId} and ${slaveId}")
            titleRe = /title:*/
            //compute the baseline
            masterTitleMatcher = (masterPar.text =~ titleRe)
            slaveTitleMatcher = (slavePar.text =~ titleRe)
            baseline=computeBaseline(masterId, slaveId)
            //compute the interferogram and get its size
            //we copy the off par to avoid nextflow sensing that is
            //being modified, as this would prevent it to cache the expensive process
            '''
            cp !{off_par} off_par_post
            SLC_intf !{master} !{slave} !{masterPar} !{slavePar} off_par_post ifgram !{rlks} !{azlks} - - 0 0 1 1 - - - -
            multi_look !{master} !{masterPar} masterMli master_mli_par !{rlks} !{azlks}
            multi_look !{slave} !{slavePar} slaveMli  slave_mli_par !{rlks} !{azlks}
            wd=$(get_value !{off_par} interferogram_width)
            rasmph_pwr ifgram masterMli ${wd} - - - - - - - - ifgram.bmp
            '''

}
I cannot access the file content in a shell environment?
Simone Baffelli
@baffelli
Oct 09 2017 16:02
Somehow, in the Groovy script I cannot directly resolve the file. It is probably related to #378
Venkat Malladi
@vsmalladi
Oct 09 2017 21:39
anyone able to get the git tag name to be added to the trace file?
Mike Smoot
@mes5k
Oct 09 2017 21:41
@vsmalladi I write the git tag along with a bunch of other metadata to a separate file at the end of my workflows. Since this stuff would be common across all rows in the trace file, I'm not sure it makes a lot of sense to include it there.
Venkat Malladi
@vsmalladi
Oct 09 2017 23:07
@mes5k Ah okay, so you write an additional log file for the git tag. Ya doesn’t make sense to put the same tag for each row. Trying to figure out how to ingest all the metadata and populate into a database.
Mike Smoot
@mes5k
Oct 09 2017 23:16
right, I add a workflow.onComplete { /* save stuff here */ } call at the end of my pipeline. I've got a common function that I share across pipelines as a Groovy Grape.
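A minimal sketch of that pattern (the output path and field list are just examples; workflow.revision and workflow.commitId are only populated when the pipeline is run from a repository):

```nextflow
workflow.onComplete {
    // example metadata dump; adjust the path and fields as needed
    def meta = file('run_metadata.txt')
    meta.text = """\
        revision: ${workflow.revision}
        commit:   ${workflow.commitId}
        runName:  ${workflow.runName}
        success:  ${workflow.success}
        """.stripIndent()
}
```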
Venkat Malladi
@vsmalladi
Oct 09 2017 23:18
thanks @mes5k I will do the same