Nextflow community chat moved to Slack! https://www.nextflow.io/blog/2022/nextflow-is-moving-to-slack.html
test.csv
expected by process". If I cd to the NF work directory /rdsgpfs/general/ephemeral/user/ck/ephemeral/TestNF/work/f2/aca9181e283b109ffe55dc5e73d66a, I can see that test.csv was produced. The file is saved to the work directory in R using write.table(df, "test.csv").
I've just started playing around with DSL2 and I'm trying to pass the output of a process into a new, named channel. I need to join this output channel with another channel further down the workflow, so chaining operators directly from the process call doesn't work. I have a script that does something like this:
process1(parameters)
outputChannel = process1.out
    .ifEmpty {
        error "Stuff not produced"
    }
    .map { <do something> }
But I get the error:
nextflow.Session - Session aborted -- Cause: No signature of method: nextflow.script.ChannelArrayList.ifEmpty() is applicable for argument types: (Script_48650d62$_runScript_closure7$_closure12) values: [Script_48650d62$_runScript_closure7$_closure12@749f539e]
What am I doing wrong?
@taylor.f_gitlab
Hi all, pretty sure I've scoured the docs with no results, but is there any syntax to make a file object work like a non-consumable value? For instance, a reference fasta that gets used multiple times throughout a pipeline. Is there no better way than using .fromPath() each time?
Channel.fromPath('genome.fa').into { ref_fasta1; ref_fasta2; ref_fasta3, .... etc. }
I think in the new DSL2 for Nextflow you no longer have to do this; you can just set it once and use it repeatedly.
Could someone help me to figure out the problem? My code is like this:
pair = [:]
outChannel = Channel.create()
inChannel.subscribe onNext: {
    if(pair.containsKey(it)) {
        outChannel.bind(it)
    }
    else {
        pair[it] = null
    }
}
onComplete: {
    outChannel.close()
}
outChannel.subscribe { println "$it" }
I have checked that values were correctly bound to outChannel, but there is no output from outChannel.subscribe. Is anything wrong?
hi everyone.
I've been trying to set up a chain of processes: bcl2fastq > fastp > multiqc.
I've been going over the docs and I can't seem to figure out how to scoop out the demultiplexed reads from bcl2fastq, organize them into read pairs and then pipe them into another process.
fastq_output.flatMap().map{ file ->
    if ("${file}".contains("_R1_") || "${file}".contains("_R2_") ){
        def key_match = file.name.toString() =~ /(.+)_R\d+_001\.fastq\.gz/
        def key = key_match[0][1]
        return tuple(key, file)
    }
}
.groupTuple()
.into{ read_files_fastqc; read_files_fastp}
Figured it out I guess, made sense to use flatmap to tidy up the reads ...
@happykhan
how do I scoop out the demultiplexed reads from bcl2fastq, organize them into read pairs
I do not do this inside Nextflow; I separate my demultiplexing pipeline from the rest of my analysis. I run a script on the demultiplexing output to collect the sample R1/R2 pairs into a new samplesheet, which is the input for the analysis pipeline.
Demultiplexing pipeline: https://github.com/NYU-Molecular-Pathology/demux-nf
downstream analysis pipeline: https://github.com/NYU-Molecular-Pathology/NGS580-nf
samplesheet generation (parsing of the R1 R2 pairs) happens here:
https://github.com/NYU-Molecular-Pathology/NGS580-nf/blob/4986e0a6a5eb9fec3e5016c8de29b60d5044df96/Makefile#L170
using this script:
https://github.com/NYU-Molecular-Pathology/NGS580-nf/blob/4986e0a6a5eb9fec3e5016c8de29b60d5044df96/generate-samplesheets.py
If you wanted to do it all inside one pipeline, you might want to reuse some of the functions of that script to do the SampleID-R1-R2 pairing and then output to Nextflow in a new channel. Or, if you are good with regex, you might be able to do it natively in a Nextflow channel .map or something like that.
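For illustration, the SampleID-R1-R2 pairing described above can be sketched in a few lines of Python. This is a minimal sketch, not the code from generate-samplesheets.py: it assumes file names follow the same `<sample>_R<n>_001.fastq.gz` pattern as the regex in the .map example earlier in the chat, and the function name `pair_reads` and the file names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming pattern: <sample>_R<read number>_001.fastq.gz
FASTQ_PATTERN = re.compile(r"(.+)_R(\d+)_001\.fastq\.gz$")

def pair_reads(filenames):
    """Group FASTQ file names by sample ID, ordered by read number."""
    groups = defaultdict(dict)
    for name in filenames:
        match = FASTQ_PATTERN.match(name)
        if match is None:
            # Skip index reads (e.g. _I1_) and anything else that does not match
            continue
        sample_id = match.group(1)
        read_number = int(match.group(2))
        groups[sample_id][read_number] = name
    # Return {sample_id: [R1 file, R2 file, ...]}
    return {
        sample: [reads[n] for n in sorted(reads)]
        for sample, reads in groups.items()
    }

# Hypothetical file names, deliberately out of order
files = [
    "SampleA_S1_L001_R2_001.fastq.gz",
    "SampleA_S1_L001_R1_001.fastq.gz",
    "SampleB_S2_L001_R1_001.fastq.gz",
    "SampleB_S2_L001_R2_001.fastq.gz",
    "SampleA_S1_L001_I1_001.fastq.gz",
]
pairs = pair_reads(files)
for sample, reads in pairs.items():
    print(sample, reads)
```

Each entry of `pairs` could then be written out as one samplesheet row (sample ID, R1, R2) for the downstream pipeline to read.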
Hi there, I am having an issue trying to run a Perl script within my Nextflow workflow. Is it possible to call a script from Nextflow, or do you have to write the script inside the process?
I have tried both ways and have had no success so far.
I am trying to convert the gene_id values in 'stringtieMerged.gtf', which are the default output, to the miRBase names that I have.
The Perl script works outside of Nextflow, but I have issues when placing it into the workflow. I have provided the code below.
This is the process that creates the merged list:
process createList {
    module 'stringtie'
    publishDir "$baseDir/../output/stringtieGTF", mode: 'copy'
    tag "${listGTF}"
    errorStrategy { task.exitStatus == 0 ? 'retry' : 'terminate' }
    maxRetries 3
    maxErrors -1

    input:
    file listGTF from listGTF.collect()
    file gff from gffFile

    output:
    file "mergeList.txt" into mergeList
    file "stringtieMerged.gtf" into stringtieMerged

    script:
    """
    touch mergeList.txt
    ls -1 $listGTF > mergeList.txt
    stringtie --merge -o stringtieMerged.gtf -G ${gff} mergeList.txt
    """
}
This is the Perl script that works perfectly fine outside of the workflow, but inside it I get either exit status 25 or 2:
process swapID {
    publishDir "$baseDir/../output/stringtieGTF", mode: 'copy'

    input:
    file "stringtieMerged.gtf" from stringtieMerged
    file gff from gffFile

    output:
    file "temp.gtf" into stringtieMergedID

    script:
    """
    #!/usr/bin/env perl
    my \$gff = "mmuChr.gff3";
    my \$gtf = "stringtieMerged.gtf";
    # open the GFF3 from miRBase and build a lookup
    my %lookup; # key-value object
    open FPIN, "<".${gff} or die; # open file for reading
    while (<FPIN>) { # loop over each line in turn
        if (/ID\\=([^;]+);.*Name\\=([^;]+)[\\t \\r\\n\\f;]+/) { # if the line contains this pattern (+ means at least one match, * means zero or more)
            my (\$id, \$name) = (${1}, ${2});
            die if (exists \$lookup{\$id}); # not strictly needed, but checks that the ID does not appear twice in the file
            \$lookup{\$id} = \$name;
        }
    }
    close FPIN;
    # open GTF and create new temp file with substituted names
    open FPIN, "<".\$gtf or die;
    open FPOUT, ">temp.gtf" or die; # output to a temp file
    while (my \$line = <FPIN>) {
        if (\$line =~ /; transcript_id \"(MI[^\"]+)\";/) {
            my \$id = ${1};
            die \$id unless (exists \$lookup{\$id});
            my \$id2 = \$lookup{\$id};
            # make substitution
            \$line =~ s/gene_id \"[^\"]+\"/gene_id "\$id2"/;
        }
        print FPOUT \$line;
    }
    close FPIN;
    close FPOUT;
    """
}
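As a reference for what the Perl above is meant to do, here is a minimal Python sketch of the same two steps: build an ID-to-Name lookup from the GFF3 attributes, then substitute gene_id in every GTF line whose transcript_id starts with MI. It runs on in-memory lines rather than files; the function names and the sample records are hypothetical, and the regexes are adapted from the Perl ones. One thing that may be worth checking in the Nextflow version: as far as I know, unescaped `${...}` inside a double-quoted script block (such as `${1}` or `${gff}`) is interpolated by Nextflow before Perl ever sees it.

```python
import re

# Patterns adapted from the Perl script above
GFF_ATTR = re.compile(r"ID=([^;]+);.*Name=([^;\s]+)")
TRANSCRIPT = re.compile(r'; transcript_id "(MI[^"]+)";')
GENE_ID = re.compile(r'gene_id "[^"]+"')

def build_lookup(gff_lines):
    """Map each miRBase ID to its Name, failing on duplicate IDs."""
    lookup = {}
    for line in gff_lines:
        match = GFF_ATTR.search(line)
        if match is None:
            continue
        mirbase_id, name = match.group(1), match.group(2)
        if mirbase_id in lookup:
            raise ValueError(f"duplicate ID in GFF3: {mirbase_id}")
        lookup[mirbase_id] = name
    return lookup

def swap_gene_ids(gtf_lines, lookup):
    """Replace gene_id with the looked-up name on lines with an MI transcript_id."""
    swapped = []
    for line in gtf_lines:
        match = TRANSCRIPT.search(line)
        if match is not None:
            mirbase_id = match.group(1)
            if mirbase_id not in lookup:
                raise KeyError(f"ID not found in GFF3 lookup: {mirbase_id}")
            name = lookup[mirbase_id]
            # A function replacement avoids backslash handling in the name
            line = GENE_ID.sub(lambda _m: f'gene_id "{name}"', line, count=1)
        swapped.append(line)
    return swapped

# Hypothetical records for illustration only
gff_lines = [
    "chr1\t.\tmiRNA_primary_transcript\t100\t200\t.\t+\t.\t"
    "ID=MI0000001;Alias=MI0000001;Name=mmu-mir-example\n",
]
gtf_lines = [
    'chr1\tStringTie\ttranscript\t100\t200\t1000\t+\t.\t'
    'gene_id "MSTRG.1"; transcript_id "MI0000001";\n',
]
lookup = build_lookup(gff_lines)
swapped = swap_gene_ids(gtf_lines, lookup)
print(swapped[0], end="")
```

Unlike a bare `or die`, the explicit error messages here say which ID caused the failure, which makes an exit status easier to trace back.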
process swapID {
    publishDir "$baseDir/../output/stringtieGTF", mode: 'copy'

    input:
    file "stringtieMerged.gtf" from stringtieMerged
    file gff from gffFile

    output:
    file "temp.gtf" into stringtieMergedID

    script:
    """
    perl script.pl
    """
}
script block of Nextflow. So you don't actually have to write Perl code inside Nextflow; just give the path to your script, declare the right inputs and outputs, and it should work.
Caused by:
  Process `swapID` terminated with an error exit status (2)
Command executed:
  perl /scratch/c.c1860369/nextFlow/bin/parseme.pl
Command exit status:
  2
Command output:
  (empty)
Command error:
  Died at /scratch/c.c1860369/nextFlow/bin/parseme.pl line 10.