These are chat archives for nextflow-io/nextflow

16th
Jan 2018
Manabu ISHII
@manabuishii
Jan 16 2018 09:35 UTC

Is there an option to suppress the verbose messages when Nextflow creates the dot file?

nextflow run workflow.nf -with-dag output.dot

Typically this would be --silent or --quiet ...
But I cannot find any such option.

Paolo Di Tommaso
@pditommaso
Jan 16 2018 09:37 UTC
nextflow -q run workflow.nf -with-dag output.dot
Manabu ISHII
@manabuishii
Jan 16 2018 09:43 UTC
@pditommaso Thanks.
Does this command just create output.dot, or does it also execute workflow.nf?
I only want output.dot.
Paolo Di Tommaso
@pditommaso
Jan 16 2018 09:43 UTC
unfortunately it's not possible, NF creates the DAG at runtime :/
Manabu ISHII
@manabuishii
Jan 16 2018 09:45 UTC
@pditommaso Thanks for the information.
Paolo Di Tommaso
@pditommaso
Jan 16 2018 09:45 UTC
you are welcome
Tim Diels
@timdiels
Jan 16 2018 12:23 UTC
I've been given a pipeline which tends to read from and write to a shared database a lot, and I'm considering replacing the custom-built engine with Nextflow. It seems to me that using a shared database for reads and writes breaks job isolation and caching, and makes it tricky to run the jobs concurrently, because one has to manually check whether two jobs could affect each other. Would you agree it's generally a good idea to try to get rid of the database and instead use files, plus small temporary sqlite databases (copied when modified) for data which doesn't fit in memory? Or are there better alternatives?
Paolo Di Tommaso
@pditommaso
Jan 16 2018 12:25 UTC
that's just the idea behind NF
using a DB is just a bad idea in most cases
Tim Diels
@timdiels
Jan 16 2018 12:29 UTC
Good, I hope I'll be able to convince the others then :P
Paolo Di Tommaso
@pditommaso
Jan 16 2018 12:29 UTC
:)
marchoeppner
@marchoeppner
Jan 16 2018 13:46 UTC
Hi everyone. I'm stuck on a "groupBy" problem. I parse a CSV file as input that has the following columns: sampleID,libraryID,R1,R2 - and I want to group the resulting output so that I can merge read pairs by library ID first (in case one library ran across multiple lanes). But I can't figure out how to make groupBy or groupTuple work with hash maps,
which is what seems to come out of splitCsv
Paolo Di Tommaso
@pditommaso
Jan 16 2018 13:49 UTC
Channel.fromPath('/your/csv/file')
    .map { sampleId, libraryId, R1, R2 -> [ libraryId, sampleId, file(R1), file(R2) ] }
    .groupTuple()
marchoeppner
@marchoeppner
Jan 16 2018 13:49 UTC
ah... thanks!
Paolo Di Tommaso
@pditommaso
Jan 16 2018 13:50 UTC
note that libraryId is returned as first element, so groupTuple will use it as grouping key
marchoeppner
@marchoeppner
Jan 16 2018 13:51 UTC
right, that makes sense
marchoeppner
@marchoeppner
Jan 16 2018 14:01 UTC
isn't the example missing the splitCsv bit?
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:05 UTC
oh yes, sorry, after fromPath
marchoeppner
@marchoeppner
Jan 16 2018 14:05 UTC
am getting this now:
ERROR ~ No signature of method: _nf_script_ada9dd56$_run_closure2.call() is applicable for argument types: (java.util.LinkedHashMap) values: [[SampleID:E08364-L3, libraryID:E08364-L3, R1:/ifs/data/nfs_share/sukmb352/projects/acinetobacter_data/follow_up/E08364-L3_S2_L004_R1_001.fastq.gz, ...]]
Possible solutions: any(), any(), each(groovy.lang.Closure), any(groovy.lang.Closure), each(groovy.lang.Closure), any(groovy.lang.Closure)
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:07 UTC
what does your snippet look like?
marchoeppner
@marchoeppner
Jan 16 2018 14:07 UTC
Channel.from(inputFile)
        .splitCsv(sep: ';', header: true)
        .map { sampleID, libraryID, R1, R2 -> [ libraryID, sampleID, file(R1), file(R2) ] }
        .groupTuple()
        .set{ inputReads }

process Merge {

        tag "${libraryID}"
        publishDir("${OUTDIR}/Data/${libraryID}")

        input:
        set libraryID,sampleID,forward_reads,reverse_reads from inputReads

        output:
        set libraryID,file(left_merged),file(right_merged) into inputTrimgalore

        script:
        left_merged = libraryID + "_R1.fastq.gz"
        right_merged = libraryID + "_R2.fastq.gz"

        """
                zcat ${forward_reads.join(" ")} | gzip > $left_merged
                zcat ${reverse_reads.join(" ")} | gzip > $right_merged
        """
}
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:09 UTC
what NF version are you using ?
$ nextflow -version
marchoeppner
@marchoeppner
Jan 16 2018 14:10 UTC
0.25.5, also tried with 0.26.0
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:13 UTC
ah, the problem is header:true
when you specify that, splitCsv returns an associative array for each entry instead of a tuple (list)
use skip:1 instead
and of course it should be Channel.fromPath(inputFile) instead of Channel.from(inputFile)
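Putting those two fixes together, the channel definition from the snippet above would presumably look something like the sketch below. It was not posted in the chat. It assumes the same `;`-separated file with a single header row, and the `samples.csv` path is a hypothetical stand-in for whatever `inputFile` pointed to.

```groovy
// Hypothetical path; in the original snippet inputFile was defined elsewhere.
inputFile = 'samples.csv'

Channel.fromPath(inputFile)            // emit the CSV as a file, not as a plain string
        .splitCsv(sep: ';', skip: 1)   // skip the header row; each row is emitted as a list
        // a list of four values can be destructured into the four closure parameters
        .map { sampleID, libraryID, R1, R2 -> [ libraryID, sampleID, file(R1), file(R2) ] }
        .groupTuple()                  // group on the first element (libraryID)
        .set { inputReads }
```

Keeping `header: true` should also work if the closure takes the row as a single map and reads the columns by name (for example `row.libraryID`), with the names matching the CSV header exactly.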
marchoeppner
@marchoeppner
Jan 16 2018 14:17 UTC
glad it wasn't just my stupidity ;) seems to work now, thanks!
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:19 UTC
actually a bit tricky ..
welcome!
marchoeppner
@marchoeppner
Jan 16 2018 14:25 UTC
for the groupTuple bit, I suppose I can also use a "by: [0, 1]" or similar?
Paolo Di Tommaso
@pditommaso
Jan 16 2018 14:26 UTC
yes!
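A minimal sketch of what that could look like, assuming the `by` option of groupTuple takes a list of zero-based positions in the tuple produced by `map` (here 0 is libraryID and 1 is sampleID). The `samples.csv` path is hypothetical and the snippet was not part of the chat.

```groovy
// Hypothetical path, used only for illustration.
inputFile = 'samples.csv'

Channel.fromPath(inputFile)
        .splitCsv(sep: ';', skip: 1)
        .map { sampleID, libraryID, R1, R2 -> [ libraryID, sampleID, file(R1), file(R2) ] }
        // composite key: rows are grouped only when both libraryID and sampleID match
        .groupTuple(by: [0, 1])
        .set { inputReads }
```

With a composite key, both key elements stay single values and only the remaining elements (the R1 and R2 files) are collected into lists, so the `Merge` process input declaration can stay as it is.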