I'm having a problem defining optional outputs in DSL2. I have a workflow where part of it is optional:
main:
process1()
if (params.run_extra) {
process2()
}
That works fine. But now I want to emit the outputs. I've tried:
emit:
process1.out
if (params.run_extra) {
process2.out
}
But that doesn't seem to be allowed.
Any ideas?
emit:
some_ch = params.run_extra ? process2.out : some_default
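An if block isn't allowed inside emit:, so the usual workaround is the ternary above with an empty channel as the fallback. A minimal sketch (my_wf, main_out and extra_out are placeholder names; Channel.empty() is one possible default):

workflow my_wf {
    main:
    process1()
    if (params.run_extra) {
        process2()
    }

    emit:
    main_out  = process1.out
    // the ternary is evaluated lazily, so process2.out is only touched when the step ran
    extra_out = params.run_extra ? process2.out : Channel.empty()
}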
I'm throwing an exception inside a subscribe method. Is it possible to catch an error that is thrown at that point? Basically something like this:
Channel.of(1, 2, 3).toList().subscribe({ list ->
// Raise error
throw new Exception("Something in the list is not valid")
})
// Would like to catch the error and stop execution here
println("Execution continues anyway") // This shouldn't be printed but it is
I've set containerOptions '--volume /data/db:/db' in the process section and also singularity.runOptions = '-B /data/db:/db', but I cannot mount it properly.
singularity.autoMounts = true is defined in my profile section too.
process test_bind {
    echo true
    containerOptions '--volume /tmp:/tmp/tmp-mounted'

    script:
    """
    cat /etc/*release >> /tmp/tmp-mounted/testing-bind.txt
    echo "Hello world! From Singularity container" >> /tmp/tmp-mounted/testing-bind.txt
    touch /tmp/tmp-mounted/thepipelinehasrun
    ls /tmp/tmp-mounted/
    cat /tmp/tmp-mounted/testing-bind.txt
    """
}
profiles {
    singularity {
        singularity.enabled = true
        singularity.autoMounts = true
        process.container = 'alpine.3.8.simg'
    }
}
Hi all,
It seems I still get a 'configuration conflict' when I run with awsbatch, like the following bug: nextflow-io/nextflow#2370.
Configuration conflict
This value was submitted using containerOverrides.memory which has been deprecated and was not used as an override. Instead, the MEMORY value found in the job definition’s resourceRequirements key was used instead. More information about the deprecated key can be found in the AWS Batch API documentation.
Nextflow version
Version: 21.10.6 build 5660
Created: 21-12-2021 16:55 UTC
System: Linux 5.11.0-1022-aws
Runtime: Groovy 3.0.9 on OpenJDK 64-Bit Server VM 1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Encoding: UTF-8 (UTF-8)
How can I solve this?
I want to evaluate a condition in an if block in the directives section of a process, and if it's false I want to stop running and fail.
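One sketch of a fail-fast check, assuming the condition can be evaluated before the process is invoked rather than in its directives section (params.run_mode and my_process are placeholders):

workflow {
    // placeholder condition -- error() aborts the run with the given message
    if( !params.run_mode )
        error "run_mode must be set, stopping the run"

    my_process()
}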
I have a workflow that involves splitting up files per chromosome and then merging them later. To make the workflow a bit more flexible, I first pull the chromosomes out of the reference file using a bit of grep, so I have a channel with all the chromosome names. I can then do something like this:
input:
tuple val(id), path(file) from channel_a
each chr from chromosomes
output:
tuple val(id), val(chr), path(outfile) into channel_b
then I can group things up:
channel_b.groupTuple(by: 0)
and use that as input for the next process.
My question is, since the number of chromosomes is constant for any given run of the workflow, can I extract that value (e.g. map{ it.readLines().size() }
) and feed that into groupTuple? I thought perhaps I could assign this value to a variable and then pass that variable to the groupTuple call but this doesn't work (the type of the variable is something fancy, not an Int).
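The value produced by that map is itself a channel rather than a plain integer, which is why handing it to groupTuple directly fails. One common pattern (a sketch reusing chromosomes and channel_b from the example above) is to attach the expected group size to the key with groupKey, so groupTuple knows when each group is complete:

chromosomes
    .count()                                    // value channel holding the number of chromosomes
    .combine(channel_b)                         // emits [n, id, chr, outfile]
    .map { n, id, chr, outfile -> tuple( groupKey(id, n), chr, outfile ) }
    .groupTuple()
    .set { channel_b_grouped }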
Hi, this config worked for me:
singularity.enabled = true
singularity.runOptions = "--bind /path:/path"
Many thanks for your suggestion @tomraulet, it's finally working for me too!
Can I generate a val output to emit from a script inside a process?
I tried to emit as val
process test {
output:
val val_var , emit: val_var
shell:
"""
val_var=test
"""
}
Error:
Caused by:
Missing value declared as output parameter: val_var
I also tried to emit as env
process test{
output:
env val_var , emit: val_var
shell:
"""
val_var=test
"""
}
When I tried to use it in the downstream process by calling test.out.val_var, I got:
Caused by:
No such property: val_var for class: ScriptC3863517AF925202A24F63BCD0003707
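For reference, here is a minimal self-contained sketch of the env output pattern, which is the supported way to pass a value set inside the script back out of a process; the view() at the end is only for illustration:

nextflow.enable.dsl = 2

process test {
    output:
    env val_var, emit: val_var

    script:
    """
    val_var=test
    """
}

workflow {
    test()
    test.out.val_var.view()   // should print: test
}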
hello, I'm just starting out in nextflow and seeking some strategic advice. What I'm trying to achieve is to run a workflow for a list of input samples. I can setup a workflow for a single sample, but how do I push multiple samples through it? (most examples in the docs show a single process) Here's where I'm stuck:
params.samples = ["samples/a","samples/b","samples/c"]
process step1 {
input:
file sample from samples_ch
output:
file 'result.bam' into step1_ch
...
}
process step2 {
input:
file bam from step1_ch
output:
file 'result.vcf' into step2_ch
...
}
This runs for sample a but not the rest; I suspect it's because step2 only accepts one thing from step1_ch?
I can see two general strategies, either make a workflow for a single sample and then import that into a multi-sample wrapper, or enable each process to accept multiple inputs? Any advice would be greatly appreciated! Thanks
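A sketch of the first idea in the same DSL1 style as the snippet above: create samples_ch from params.samples with Channel.fromPath, and each process then runs once per item arriving on its input channel, so every sample flows through step1 and step2 without a wrapper (my_aligner and my_caller are placeholder commands):

params.samples = ["samples/a", "samples/b", "samples/c"]

// one item per sample; downstream processes run once per item
samples_ch = Channel.fromPath(params.samples)

process step1 {
    input:
    file sample from samples_ch

    output:
    file 'result.bam' into step1_ch

    """
    my_aligner ${sample} > result.bam   # placeholder command
    """
}

process step2 {
    input:
    file bam from step1_ch

    output:
    file 'result.vcf' into step2_ch

    """
    my_caller ${bam} > result.vcf       # placeholder command
    """
}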
I built my container image from a .def file with singularity build. This proved to work, but I am unsure how to make this portable. What are best practices regarding images, and how should I do this to optimize functionality and user-friendliness?
If I see that correctly, the error strategy 'ignore' will lead to workflow.success = true at the end.
When running many samples I would indeed like to ignore a single failing one, but then check at the very end whether any of them failed.
Is there a trace/object which can be accessed in main.nf where one could verify that at the end? Something like:
| Sample | success |
|--------|---------|
| 1      | true    |
| 1      | true    |
| 1      | true    |
| 1      | true    |
| 1      | false   |
| 1      | true    |
| 1      | true    |
This would allow cleaning up published data that would otherwise be left behind as orphan files.
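One way to get a per-task success/failure record is the built-in trace report; a sketch for nextflow.config (the file name and field selection are arbitrary choices), which can then be inspected or parsed after the run:

trace {
    enabled = true
    file    = 'pipeline_trace.txt'
    fields  = 'task_id,process,tag,status,exit'
}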
I am having an issue where an imported module has an implicit workflow.onComplete handler. When I run the main workflow, the imported workflow.onComplete handler is being triggered, I assume because wf2's "workflow" is in the wf1 namespace. Example code:
// wf1.nf
include { subworkflow } from "./wf2"
// wf1 implicit workflow
workflow {
main:
println('wf1 implicit workflow called')
}
// pulls wf2 implicit workflow.onComplete into this namespace and executes
----------------------------------------
// wf2.nf
//explicitly named workflow that is imported to wf1
workflow subworkflow {
main:
println('wf2 as subworkflow called')
}
// wf2 implicit workflow
workflow {
main:
println('wf2 implicit workflow called')
}
// wf2 implicit workflow.onComplete handler
workflow.onComplete {
log.info('wf2 implicit workflow completed')
}
Command and output is:
$ nextflow run wf1.nf
N E X T F L O W ~ version 21.10.2
Launching `wf1.nf` [awesome_euclid] - revision: e050c16fcb
wf1 implicit workflow called
wf2 implicit workflow completed
Is there a way to avoid this namespace clash while keeping the workflow.onComplete handler for wf2? Or do I need to pull the subworkflow in the example above out into its own separate file and have wf1 import directly from that?
output:
tuple val(sampleId), path("fastq_files/*_R{1,2}_001.fastq.gz"), emit: fastq
I have a situation where I need some dynamic input values for a process which I will need to fetch from a datastore as part of the pipeline. I was wondering what the best/accepted way of getting these values available to the process as variables is? My initial thought is to have the script that grabs the values from the datastore output a JSON file and then use a JSON reader in the process that requires them to access them?
Something like:
proc1 {
output:
path patient_data.json
script:
"""
python get_patient_data.py
"""
}
proc2 {
input:
path patient_data_file
path other_file
output:
path some_output.file
script:
patient_data = jsonSlurper.parse(patient_data_file)
"""
the_command --opt1 ${patient_data['val1']} --opt2 ${patient_data['val2']} other_file
"""
}
Is this a reasonable solution? (I am aware the actual code above won't work because I haven't properly created the jsonslurper)
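The approach is reasonable; as a sketch, the JSON can be parsed with Groovy's JsonSlurper in the script section of the consuming process (the_command, val1 and val2 are the placeholders from the example above):

import groovy.json.JsonSlurper

process proc2 {
    input:
    path patient_data_file
    path other_file

    output:
    path "some_output.file"

    script:
    // read and parse the upstream JSON when the task script is built
    def patient_data = new JsonSlurper().parseText(patient_data_file.text)
    """
    the_command --opt1 ${patient_data['val1']} --opt2 ${patient_data['val2']} ${other_file} > some_output.file
    """
}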
Hi all, I am getting a java.nio.file.ProviderMismatchException
when I run the following script:
process a {
output:
file _biosample_id optional true into biosample_id
script:
"""
touch _biosample_id
"""
}
process b {
input:
file _biosample_id from biosample_id.ifEmpty{file("_biosample_id")}
script:
def biosample_id_option = _biosample_id.isEmpty() ? '' : "--biosample_id \$(cat _biosample_id)"
"""
echo \$(cat ${_biosample_id})
"""
}
I'm using a slightly modified version of the optional input pattern. Any ideas on why I'm getting the java.nio.file.ProviderMismatchException?
workflow mywf {
    take:
    data_dir

    main:
    task1(data_dir)
    task2(data_dir) // should wait for task1 to complete before starting
}
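Processes are only ordered by data dependencies, so a sketch of forcing task2 to wait (assuming task2 can take an extra 'ready' input and task1 emits at least one output):

workflow mywf {
    take:
    data_dir

    main:
    task1(data_dir)
    // task1.out.collect() only becomes available once all task1 jobs are done,
    // so task2 cannot start earlier (task2 must declare this extra input)
    task2(data_dir, task1.out.collect())
}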
Using .collect() ends up generating more than 9k symlinked files in the same folder for each sample. Is there any way to collect them separately?
I want to run b 3 times on the output of a:
nextflow.enable.dsl = 2
process a {
input:
val x
output:
val y
exec:
y = x.toUpperCase()
}
process b {
input:
val x
val n
output:
val y
exec:
y = "$x$n"
}
workflow {
x = channel.value('a')
n = channel.of(1..3)
// I know these lines would work.
//p = a(x)
//b(p, n) | collect | view
// Is there any way to do it all in one pipeline?
a(x) | b(???, n) | collect | view
}
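One way to keep it on a single line (a sketch using only the processes above) is to nest the call to a inside b and let the pipe continue from b's output:

workflow {
    x = channel.value('a')
    n = channel.of(1..3)

    b(a(x), n) | collect | view
}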
I want to get a channel of the form:
['lib1', 'species1']
['lib1', 'species2']
['lib2', 'species3']
My process is parsing a kraken2 report text file to find any species present above some threshold per lib:
process select_species {
input:
tuple val(library_id), path(kraken_report)
val(threshold)
output:
tuple val("${library_id}"), stdout , emit: species_list
script:
"""
awk '\$1>${threshold} && \$4=="S" { print \$NF }' ${kraken_report} | grep -v sapiens
"""
}
This gives me a tuple that contains newlines, but I feel like I'm only 1 magic nextflow command away from getting my desired output.
Current output:
[P01900, coli
ananatis
oryzae
acnes
barophilus
VB_PmiS-Isfahan
]
Desired output:
[P01900, coli]
[P01900, ananatis]
[P01900, barophilus]
...
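One sketch that gets from the current output to the desired one, reusing the species_list channel above: split the multi-line stdout into a list of species names, then transpose so each species is paired with its library id:

select_species.out.species_list
    .map { library_id, species_txt ->
        // one clean species name per list element
        tuple( library_id, species_txt.trim().split('\n').collect { it.trim() } )
    }
    .transpose()
    .view()   // [P01900, coli], [P01900, ananatis], ...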