These are chat archives for nextflow-io/nextflow

2nd
May 2019
Chris Jackson
@Jackson_Chris_J_twitter
May 02 02:01
Hi all. I have just started using Nextflow today (looks great!). However, when running the ‘your first script’ code from https://www.nextflow.io/docs/latest/getstarted.html, I’m finding that the ‘HELLO\n WORLD’ output is being overwritten by the Nextflow information messages (executor > local, [71/f3d836] process > splitLetters [100%] 1 of 1, etc). I can see the words ‘HELLO’ and ‘ WORLD’ appear very briefly, but then they are overwritten. If I re-run the script with the -resume flag I can sometimes see ‘HELLO’ or ‘WORLD’, but not on every run. I’m using Nextflow 19.04.0. Is there an obvious solution to this that I’m missing?
Rad Suchecki
@rsuchecki
May 02 02:35
The default progress reporting mode changed recently. To go back to the previous console output, run with -ansi-log false
Chris Jackson
@Jackson_Chris_J_twitter
May 02 02:36
Beautiful - thanks!
Rad Suchecki
@rsuchecki
May 02 02:37
:+1:
Chelsea Sawyer
@csawye01
May 02 10:50
I want to turn a channel into a Groovy Map, but the code I've written only puts one value from the channel into the map. Is there a better way to do this, or a step I've missed?
tuple_ch = Channel.from( ["MultiQC", "multiqc_report.html"], ["Index", "index.html"] )
def mapped_project_multiqc = [:]
tuple_ch.map{ left, right -> mapped_project_multiqc[left] = right }
Paolo Di Tommaso
@pditommaso
May 02 11:28
this won't work, use reduce instead
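A minimal sketch of what the reduce approach could look like (an illustration, not Paolo's own code). It assumes the Nextflow 19.04 channel API used in the question; the closure parameter names acc and pair are hypothetical. Channel items are handled asynchronously, so a map filled as a side effect of map{} may be read before every item has arrived, whereas reduce emits the accumulated map once, after the last item.

```groovy
// Same input channel as in the question.
tuple_ch = Channel.from( ["MultiQC", "multiqc_report.html"], ["Index", "index.html"] )

// reduce starts from an empty map (the seed) and folds each
// [key, value] pair into it, returning the accumulator each time.
mapped_ch = tuple_ch.reduce( [:] ) { acc, pair ->
    acc[ pair[0] ] = pair[1]
    return acc
}

// The result is a channel that emits the finished map once.
mapped_ch.subscribe { mapped_project_multiqc ->
    println "MultiQC report: ${mapped_project_multiqc['MultiQC']}"
    println "Index page:     ${mapped_project_multiqc['Index']}"
}
```

The map is only available inside the subscribe closure (or to whatever consumes mapped_ch), not as an ordinary variable in the script body.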
Shawn Rynearson
@srynobio
May 02 12:27
Thanks @rsuchecki for the .trim() examples, they are just what I need.
Ólavur Mortensen
@olavurmortensen
May 02 14:32

I have an issue where the first process seems to be called multiple times, even though there aren't multiple inputs, and uses the same working directory each time. In the example below, you can see that the process consolidate_gvcf is called multiple times, using the working directory with the prefix f0/67b1c7, and you can see several executor > local ([number]) statements (in fact, there are 6).

Can anybody help me here? I don't know how to debug this problem, I can't find anything in the code or config that would cause this.

P I P E L I N E     I P U T S    
=================================
gvcf_path          : data/results/gvcf
reference          : resources/reference_10x_Genomics/refdata-GRCh38-2.1.0
dbsnp              : resources/gatk_bundle/Homo_sapiens_assembly38.dbsnp138/Homo_sapiens_assembly38.dbsnp138.vcf
mills              : resources/gatk_bundle/Mills_and_1000G_gold_standard.indels.hg38/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
kGphase1           : resources/gatk_bundle/1000G_phase1.snps.high_confidence.hg38/1000G_phase1.snps.high_confidence.hg38.vcf.gz
kGphase3           : resources/gatk_bundle/1000G.phase3.integrated.sites_only.no_MATCHED_REV.hg38/1000G.phase3.integrated.sites_only.no_MATCHED_REV.hg38.vcf
omni               : resources/gatk_bundle/1000G_omni2.5.hg38/1000G_omni2.5.hg38.vcf.gz
hapmap             : resources/gatk_bundle/hapmap_3.3.hg38.vcf.gz/hapmap_3.3.hg38.vcf.gz
targets            : resources/sureselect_human_all_exon_v6_utr_grch38/S07604624_Padded.bed
threads            : 20
mem                : 200
outdir             : data/results
executor >  local (1)
[f0/67b1c7] process > consolidate_gvcf [100%] 1 of 1 ✔
executor >  local (4)
[f0/67b1c7] process > consolidate_gvcf [100%] 1 of 1 ✔
[dc/4a53a2] process > joint_genotyping [100%] 1 of 1 ✔
[16/51e13d] process > get_snps         [100%] 1 of 1 ✔
executor >  local (8)
[f0/67b1c7] process > consolidate_gvcf   [100%] 1 of 1
Paolo Di Tommaso
@pditommaso
May 02 14:35
I think it's just a matter of broken layout
do you have any println in your script?
Ólavur Mortensen
@olavurmortensen
May 02 14:37
I do have println statements, only the following:
println "P I P E L I N E     I P U T S    "
println "================================="
println "gvcf_path          : ${params.gvcf_path}"
println "reference          : ${params.reference}"
println "dbsnp              : ${params.dbsnp}"
println "mills              : ${params.mills}"
println "kGphase1           : ${params.kGphase1}"
println "kGphase3           : ${params.kGphase3}"
println "omni               : ${params.omni}"
println "hapmap             : ${params.hapmap}"
println "targets            : ${params.targets}"
println "threads            : ${params.threads}"
println "mem                : ${params.mem}"
println "outdir             : ${params.outdir}"
Paolo Di Tommaso
@pditommaso
May 02 14:38
or maybe any process with echo true ?
Ólavur Mortensen
@olavurmortensen
May 02 14:38
yeah
Paolo Di Tommaso
@pditommaso
May 02 14:40
I think it's breaking the logging .. or it may be a glitch in the new logging
just ignore it or remove the echo or use the option -ansi-log false
Ólavur Mortensen
@olavurmortensen
May 02 14:41
Yeah that was the reason :) It works now. I'm only using echo true for testing purposes, to make sure that everything runs as it is supposed to, inputs and outputs and such.
There are other ways to do this, so it's fine :D
Thanks for the help
Paolo Di Tommaso
@pditommaso
May 02 14:42
:+1:
David Trudgian
@dctrud
May 02 15:26
Hello all. Wondering if nextflow run -with-mpi should be respecting the -cluster.join path:xxxxxx option - or if I have to manually start nextflow nodes to use a filesystem path for co-ordination?
I'm having zero success at present getting the Ignite workers (which start correctly on multiple cluster nodes) to be noticed; everything just runs on the master.
Chelsea Sawyer
@csawye01
May 02 15:34
@pditommaso Is there a workflow.onStart (like workflow.onComplete) event handler for pipelines? It would be useful if I could have the pipeline send a notification email saying that the workflow has started, as it will be triggered automatically by the presence of specific files in a directory.
David Trudgian
@dctrud
May 02 15:45

It looks like if I launch with srun --exclusive --cpus-per-task=32 --distribution=cyclic nextflow run main.nf -cluster.join path:/home2/dtrudgian/Git/astrocyte_example_wordcount_ignite/workflow/nf_ignite_14320632/state -with-mpi -with-trace -with-timeline then only the master is using the fs path for discovery.

nf_ignite_14320632/state:
0_0_0_0_0_0_0_1%lo#47500  10.100.161.68#47500  127.0.0.1#47500  172.18.225.68#47500

In the .nextflow_node_X.log files the other Ignite workers on other nodes are trying to use multicast

Eugene Bragin
@eugene.bragin_gitlab
May 02 16:12
I'm planning to use memory { 8.GB * task.attempt } for my assembly process. The assembler I'm using (SPAdes) accepts a -m INT option. Would I be able to use -m ${task.memory}? That is, will this variable return an integer or a string with 'GB'?
Eugene Bragin
@eugene.bragin_gitlab
May 02 16:23
ok, after trying and looking into the MemoryUnit class, I reckon I need to do ${task.memory.giga}
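A minimal sketch of that approach, assuming the DSL1 process syntax of Nextflow 19.04. The process name show_memory_flag and the channel flag_ch are hypothetical, and the script only echoes the command line instead of running SPAdes, so the interpolated value can be checked without the assembler installed.

```groovy
process show_memory_flag {
    // Request 8 GB on the first attempt, 16 GB on the second, and so on.
    memory { 8.GB * task.attempt }

    output:
    stdout into flag_ch

    script:
    // task.memory is a MemoryUnit; .giga gives the size as a whole number
    // of gigabytes, which is the form an integer -m option expects.
    """
    echo "spades.py -m ${task.memory.giga}"
    """
}

// Print whatever the process wrote to stdout.
flag_ch.subscribe { print it }
```

On a machine with less than 8 GB available the local executor may refuse to run the task, so lower the value when trying this out. Interpolating ${task.memory} on its own yields a string with the unit, such as '8 GB', which is why the .giga accessor is used here.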
David Trudgian
@dctrud
May 02 17:03
(re: Ignite) - running the nextflow node workers and the nextflow run master separately does indeed propagate the -cluster.join option everywhere, and things work as expected.
This is the kind of pattern working for me, if anyone else hits this.... https://gist.github.com/dctrud/4458b8b9a9d3dc559caed64451f48cb5
Paolo Di Tommaso
@pditommaso
May 02 17:18
@csawye01 open a feature request describing your use case
@dctrud cool approach