These are chat archives for nextflow-io/nextflow

16th
Jan 2018
Manabu ISHII
@manabuishii
Jan 16 2018 09:35

Is there an option to suppress verbose messages when Nextflow creates the dot file?

nextflow run workflow.nf -with-dag output.dot

Typically this would be --silent or --quiet ...
But I cannot find any such option.

Paolo Di Tommaso
@pditommaso
Jan 16 2018 09:37
nextflow -q run workflow.nf -with-dag output.dot
Manabu ISHII
@manabuishii
Jan 16 2018 09:43
@pditommaso Thanks.
Does this command only create output.dot, or does it also execute workflow.nf?
I only want output.dot.
Paolo Di Tommaso
@pditommaso
Jan 16 2018 09:43
unfortunately it's not possible, NF creates the DAG at runtime :/
Manabu ISHII
@manabuishii
Jan 16 2018 09:45
@pditommaso Thanks for the information.
Paolo Di Tommaso
@pditommaso
Jan 16 2018 09:45
you are welcome
Tim Diels
@timdiels
Jan 16 2018 12:23
I've been given a pipeline which tends to read/write a lot to a shared database, and I'm considering replacing the custom build engine with Nextflow. It seems to me that using a shared database for reads and writes breaks job isolation and caching, and makes it tricky to run jobs concurrently, because one has to manually check whether two jobs could affect each other. Would you agree it's generally a good idea to try to get rid of the database and instead use files, plus small temporary SQLite databases (copied when modified) for data which doesn't fit in memory? Or are there better alternatives?
Paolo Di Tommaso
@pditommaso
Jan 16 2018 12:25
that's just the idea behind NF
using a DB is just a bad idea in most cases
Tim Diels
@timdiels
Jan 16 2018 12:29
Good, I hope I'll be able to convince the others then :P
Paolo Di Tommaso
@pditommaso
Jan 16 2018 12:29
:)
marchoeppner
@marchoeppner
Jan 16 2018 13:46
Hi everyone. I'm stuck on a "groupBy" problem. I parse as input a CSV file that has the following columns: sampleID,libraryID,R1,R2 - and I want to group the resulting output so that I can merge read pairs by library ID first (if one library ran across multiple lanes). But I can't figure out how to make groupBy or groupTuple work with hash maps (?)
which is what seems to come out of splitCsv
Paolo Di Tommaso
@pditommaso
Jan 16 2018 13:49
Channel.fromPath('/your/csv/file')
    .map { sampleId, libraryId, R1, R2 -> [ libraryId, sampleId, file(R1), file(R2) ] }
    .groupTuple()
marchoeppner
@marchoeppner
Jan 16 2018 13:49
ah... thanks!
Paolo Di Tommaso
@pditommaso
Jan 16 2018 13:50
note that libraryId is returned as the first element, so groupTuple will use it as the grouping key
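A minimal sketch, not from the original chat, of what grouping on the first tuple element means. It is plain Groovy that emulates the idea on an in-memory list rather than calling Nextflow's groupTuple, so the emission order and asynchronous behaviour of a real channel are not modelled. The library IDs, sample IDs and file names are hypothetical placeholders.

```groovy
// Plain Groovy emulation of "group by the first element of each tuple".
// All values below are hypothetical placeholders.
def rows = [
    ['L1', 'S1', 'lane1_R1.fastq.gz', 'lane1_R2.fastq.gz'],
    ['L1', 'S1', 'lane2_R1.fastq.gz', 'lane2_R2.fastq.gz'],
    ['L2', 'S2', 'lane1_R1.fastq.gz', 'lane1_R2.fastq.gz']
]

// Group rows by element 0, then gather every other position into its own list.
def grouped = rows.groupBy { it[0] }.collect { key, items ->
    [key] + (1..<items[0].size()).collect { i -> items.collect { it[i] } }
}

grouped.each { println it }
```

Each resulting tuple carries the library ID once, followed by one list per remaining column, which is why the process further down can join the forward and reverse read lists.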
marchoeppner
@marchoeppner
Jan 16 2018 13:51
right, that makes sense
marchoeppner
@marchoeppner
Jan 16 2018 14:01
isn't the example missing the splitCsv bit?
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:05
oh yes, sorry, after fromPath
marchoeppner
@marchoeppner
Jan 16 2018 14:05
am getting this now:
ERROR ~ No signature of method: _nf_script_ada9dd56$_run_closure2.call() is applicable for argument types: (java.util.LinkedHashMap) values: [[SampleID:E08364-L3, libraryID:E08364-L3, R1:/ifs/data/nfs_share/sukmb352/projects/acinetobacter_data/follow_up/E08364-L3_S2_L004_R1_001.fastq.gz, ...]]
Possible solutions: any(), any(), each(groovy.lang.Closure), any(groovy.lang.Closure), each(groovy.lang.Closure), any(groovy.lang.Closure)
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:07
what does your snippet look like?
marchoeppner
@marchoeppner
Jan 16 2018 14:07
Channel.from(inputFile)
        .splitCsv(sep: ';', header: true)
        .map { sampleID, libraryID, R1, R2 -> [ libraryID, sampleID, file(R1), file(R2) ] }
        .groupTuple()
        .set{ inputReads }

process Merge {

        tag "${libraryID}"
        publishDir("${OUTDIR}/Data/${libraryID}")

        input:
        set libraryID,sampleID,forward_reads,reverse_reads from inputReads

        output:
        set libraryID,file(left_merged),file(right_merged) into inputTrimgalore

        script:
        left_merged = libraryID + "_R1.fastq.gz"
        right_merged = libraryID + "_R2.fastq.gz"

        """
                zcat ${forward_reads.join(" ")} | gzip > $left_merged
                zcat ${reverse_reads.join(" ")} | gzip > $right_merged
        """
}
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:09
what NF version are you using ?
$ nextflow -version
marchoeppner
@marchoeppner
Jan 16 2018 14:10
0.25.5, also tried with 0.26.0
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:13
ah, the problem is header:true
when you specify that, splitCsv returns an associative array (map) for each entry instead of a tuple (list)
use skip:1 instead
and of course it should be Channel.fromPath(inputFile) instead of Channel.from(inputFile)
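A minimal sketch, not from the original chat, of why the four-parameter closure fails on a map but works on a list. It uses plain Groovy only (no Nextflow), and the row values are hypothetical. The last part shows one possible alternative, assuming you would rather keep named columns: a single-parameter closure that reads fields by name.

```groovy
// Plain Groovy, no Nextflow required. Row values are hypothetical.
def reorder = { sampleID, libraryID, r1, r2 -> [libraryID, sampleID, r1, r2] }

// A row shaped like a list: Groovy spreads it over the four closure parameters.
def listRow = ['S1', 'L1', 'a_R1.fastq.gz', 'a_R2.fastq.gz']
def fromList = reorder(listRow)

// A row shaped like a map: a single map cannot be spread over four
// parameters, so the call fails with a MissingMethodException.
def mapRow = [sampleID: 'S1', libraryID: 'L1', R1: 'a_R1.fastq.gz', R2: 'a_R2.fastq.gz']
def mapCallFailed = false
try {
    reorder(mapRow)
} catch (MissingMethodException e) {
    mapCallFailed = true
}

// Alternative for map rows: take one parameter and read the fields by name.
def reorderByName = { row -> [row.libraryID, row.sampleID, row.R1, row.R2] }
def fromMap = reorderByName(mapRow)
```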
marchoeppner
@marchoeppner
Jan 16 2018 14:17
glad it wasn't just my stupidity ;) seems to work now, thanks!
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:19
actually a bit tricky ..
welcome!
marchoeppner
@marchoeppner
Jan 16 2018 14:25
for the groupTuple bit, I suppose I can also use a "by: [0,1]" or similar?
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:26
yes!
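A minimal sketch, not from the original chat, of grouping on a composite key such as positions 0 and 1. It is a plain Groovy emulation; groupTupleSketch is a hypothetical helper, not a Nextflow API, and it assumes that key elements stay in place as single values while the other positions are collected into lists. The sample data is made up.

```groovy
// Hypothetical helper emulating grouping by several tuple positions at once.
def groupTupleSketch = { List rows, List by ->
    rows.groupBy { row -> by.collect { row[it] } }.collect { key, items ->
        (0..<items[0].size()).collect { i ->
            // Key positions keep a single value; other positions become lists.
            by.contains(i) ? items[0][i] : items.collect { it[i] }
        }
    }
}

// Hypothetical rows: libraryID, sampleID, R1, R2.
def laneRows = [
    ['L1', 'S1', 'a_R1', 'a_R2'],
    ['L1', 'S1', 'b_R1', 'b_R2'],
    ['L1', 'S2', 'c_R1', 'c_R2']
]

def byLibraryAndSample = groupTupleSketch(laneRows, [0, 1])
byLibraryAndSample.each { println it }
```

With a key of [0, 1], rows that share a library ID but differ in sample ID end up in separate groups, unlike grouping on the first element alone.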