splitFastq to split the read pairs; there was a similar question a few weeks ago
Hi, I have this array of input files, which I want to pass as a space-separated list to a GATK call:
knownIndels = ["Mills_and_1000G_gold_standard.indels.hg38.vcf.gz","Homo_sapiens_assembly38.known_indels.vcf.gz"]
Further down I want to make this call:
java -jar /biosw/generic-x86_64/gatk/3.11/GenomeAnalysisTK.jar \
-T RealignerTargetCreator \
-R !{genome} \
-I !{name}.snt.bam \
-known !{knownIndels} \
-o !{name}.target_intervals.list \
-nt !{params.threads}
Now the knownIndels variable should simply expand to the two files separated by a space, as happens with e.g. the fromFilePairs channel.
What actually happens is that the array is put into the command call as is (from .command.sh):
java -jar /biosw/generic-x86_64/gatk/3.11/GenomeAnalysisTK.jar -T RealignerTargetCreator -R Homo_sapiens_assembly38.fasta -I test.snt.bam -known [Mills_and_1000G_gold_standard.indels.hg38.vcf.gz, Homo_sapiens_assembly38.known_indels.vcf.gz] -o test.target_intervals.list -nt 10
Does anyone know how to resolve this?
known = knownIndels.collect{"-known $it"}.join(' ')
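To see why the brackets show up and what the one-liner above does, here is a minimal plain-Groovy sketch that can be run outside Nextflow. The list is copied from the question; the variable name naive is only for illustration.

```groovy
// List of known-indel files, as defined in the question
def knownIndels = [
    "Mills_and_1000G_gold_standard.indels.hg38.vcf.gz",
    "Homo_sapiens_assembly38.known_indels.vcf.gz"
]

// Interpolating the list directly uses its string form, brackets and commas included
def naive = "-known ${knownIndels}"

// Prefix each file with its own -known flag, then join the pieces with spaces
def known = knownIndels.collect { "-known $it" }.join(' ')

println naive
println known
```

If a tool really wants one flag followed by space-separated files, knownIndels.join(' ') alone would do; the collect step is there because the goal here is one -known per file.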
when clause in a process definition cannot cope with sets?
process stack{
errorStrategy 'ignore'
validExitStatus 0, 255, 127//Ignore warning of reference outside
publishDir "${params.results}/stacked/${stack_name}/${n_stack}/${used_method}"
input:
set file(unw_ls:"*.diff"), file(off_ls:"*.off"), val(master_id), val(slave_id), val(bl), val(method) from to_stack
each ref_pt from ref_pix_stack // repeat it continuously because ref_pix_stack is only sent once
each ref_mli from ref_mli_stack
each n_stack from 5..unw_ls
when:
(unw_ls as List).size() < n_stack
output:
set file(off_par), file(rate_m), file(sig_rate_m), file(sig_ph),
val(stack_id), val(used_method), val(n_stack), val(av_time) into stacked
set file('rate.bmp'), file('rate_std.bmp'), file('ph_std.bmp') into rate_ras
shell:
ERROR ~ No such variable: unw_ls
file(unw_ls:"*.diff")
ERROR ~ No such variable: pair_id
for the output line.
Channel
.fromFilePairs( params.reads )
.ifEmpty { error "Cannot find any reads matching: ${params.reads}" }
.set { read_pairs }
process bbduk {
publishDir "./result/bbduk"
input:
set pair_id, file(reads) from read_pairs
output:
set pair_id, file "*_R2.clip.fastq.gz" into cleaned_reads
"""
~/programs/bbmap/bbduk.sh -Xmx2000m t=6 in1=${reads[0]} in2=${reads[1]} out1=${pair_id}_R1.clip.fastq.gz out2=${pair_id}_R2.clip.fastq.gz ref=~/programs/bbmap/resources/adapters.fa ref=~/programs/bbmap/resources/phix174_ill.ref.fa.gz ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=r trimq=10 maq=10
"""
}
file
actually isn't there in the example... https://www.nextflow.io/example4.html
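For reference, a sketch of how the output line of the bbduk process could be declared in this (pre-DSL2) syntax, assuming the intent is to emit the pair id together with the trimmed R2 file. The only changes from the process above are the explicit val(...) around the id and the parentheses around the glob passed to file; this is something to try, not a confirmed diagnosis of the error.

```groovy
output:
// val(...) marks pair_id as a plain value; file(...) takes the glob as an argument
set val(pair_id), file("*_R2.clip.fastq.gz") into cleaned_reads
```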
`shell` and I have some Groovy code before the command; that code is executed even when the `when` condition for the process is not true
0.25.6-SNAPSHOT
fixing this
java -jar /biosw/generic-x86_64/gatk/3.11/GenomeAnalysisTK.jar \
-T RealignerTargetCreator \
-R !{genome} \
-I !{name}.snt.bam \
!{knownIndels.collect{"-known $it"}.join(' ')} \
-o !{name}.target_intervals.list \
-nt !{params.threads}
Is there a way to combine 'retry' and 'finish' in the errorStrategy directive?
Ideally, after the final retry, I would like a graceful exit that lets the running jobs complete.
process.publishDir. However, that happens almost at the beginning of my main.nf, and at that point I am not in a process scope, so I can't access it. Is there a workaround for this? My colleague (@fmorency) seems to remember something, but is now on vacation... Alternatively, if there is a clean way to do so, I would gladly take it :)
errorStrategy { task.attempt < X ? 'retry' : 'finish' }
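In context that could look like the sketch below. The process name flaky_step, the limit of 3 and the echo command are placeholders, and the maxRetries line is an assumption on my side: as far as I know only one retry is allowed by default, so the limit probably has to be raised for the later attempts to be reached.

```groovy
process flaky_step {
    // Retry on the first two failures, then let running jobs finish and exit
    errorStrategy { task.attempt < 3 ? 'retry' : 'finish' }
    // Assumed to be needed so that more than one retry is permitted
    maxRetries 3

    script:
    """
    echo "attempt ${task.attempt}"
    """
}
```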
publishDir
e.g.:
params.outdir = '/some/path'
println "My out dir: $params.outdir"
:
process foo {
publishDir params.outdir
:
}