srr_ch = Channel.from( "SRR519926", "SRR1553607" )

process fastqdump {
    // Prefetch each SRA archive, then dump the first 10,000 reads to split FASTQ files
    container 'quay.io/biocontainers/parallel-fastq-dump:0.6.5--py_0'

    input:
    each srr from srr_ch

    output:
    file("*.fastq") into fastq_ch

    script:
    """
    prefetch ${srr} && parallel-fastq-dump -t 8 -X 10000 --split-files -s ${srr}
    """
}
Adding a `println()` call, however, produces this error:

Channel `srr_ch` has been used twice as an input by process `fastqdump` and another operator

In DSL1 a queue channel can be consumed only once, so the `srr_ch.println()` call and the process below compete for the same channel:
srr_ch = Channel.from( "SRR519926", "SRR1553607" )
srr_ch.println()   // consuming srr_ch here triggers the error above

process fastqdump {
    container 'quay.io/biocontainers/parallel-fastq-dump:0.6.5--py_0'

    input:
    each srr from srr_ch

    output:
    file("*.fastq") into fastq_ch

    script:
    """
    prefetch ${srr} && parallel-fastq-dump -t 8 -X 10000 --split-files -s ${srr}
    """
}
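One way around this in DSL1 is to either duplicate the channel with `into`, giving each consumer its own copy, or to use `view()`, which prints each item and re-emits it downstream. A minimal sketch with the same accessions (the `srr_debug_ch` name is just for illustration):

// Option 1: duplicate the channel so the process and the print each get a copy
Channel
    .from( "SRR519926", "SRR1553607" )
    .into { srr_ch; srr_debug_ch }

srr_debug_ch.println()

// Option 2: view() prints each item and passes it through unchanged
srr_ch = Channel.from( "SRR519926", "SRR1553607" ).view()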
Hi @pditommaso and all. I'm running a Nextflow pipeline for realigning BAM files. I have 4000+ samples, and ~2600 of these have completed. Running nextflow run main.nf -resume crashed our interactive server, which has 192 CPUs and 10 TB of RAM.
The admins told me that system-wide CPU use from the Nextflow processes was over 1000%. I am using SGE as the executor, and the crash seems to happen while Nextflow is reading which samples are cached. Here's the output when the system became unstable:
executor > sge (3)
[6e/65371e] process > speedyseq (EGAR00001156495_10698_5_7_cram) [100%] 2613 of 2616, cached: 2613
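For context, the SGE setup described above would typically live in nextflow.config; this is only a sketch, and the queueSize cap is an assumption added here to show where a limit on concurrent submissions would go, not something taken from the report:

// Hypothetical nextflow.config sketch (values are illustrative)
process {
    executor = 'sge'
}

executor {
    queueSize = 100   // cap on tasks queued with SGE at any one time
}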
Does a URL containing a glob pattern, e.g.

https://raw.githubusercontent.com/maxulysse/test-datasets/sarek/file{1,2}.ext

become two files:

https://raw.githubusercontent.com/maxulysse/test-datasets/sarek/file1.ext
https://raw.githubusercontent.com/maxulysse/test-datasets/sarek/file2.ext

as it does with a regular path?
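For comparison, this is how the same glob behaves with a regular path; a minimal sketch assuming local files named file1.ext and file2.ext exist under a data/ directory:

// With a local path the {1,2} glob expands to one channel item per matching file
Channel
    .fromPath( 'data/file{1,2}.ext' )
    .view()
// prints something like:
//   /abs/path/data/file1.ext
//   /abs/path/data/file2.ext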