*.fastq.gz
and I can read in the sample sheet, but I'd need to match the sample sheet names to the filenames. Is there some way to use something like the .fromFilePairs factory on an existing channel?
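One possible approach (a sketch, assuming a samplesheet.csv with a sample column whose values match the file-pair prefixes; both the file name and column name are assumptions here) is to build the two channels separately and combine them with join:

samples = Channel
    .fromPath('samplesheet.csv')
    .splitCsv(header: true)
    .map { row -> tuple(row.sample, row) }   // key each row by its sample name

pairs = Channel
    .fromFilePairs('*_{1,2}.fastq.gz')       // emits [ prefix, [ read1, read2 ] ]

samples
    .join(pairs)                             // match sample name against pair prefix
    .view()

join matches on the first element of each tuple, so this only works if the samplesheet sample names and the fastq prefixes are spelled identically.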
nextflow.enable.dsl=2

process createFiles {
    output:
    path "*.txt", emit: apath

    script:
    """
    #!/usr/bin/env python
    filenames = ['a.txt', 'b.txt', 'c.txt']
    for f in filenames:
        with open(f, "w") as wf:
            wf.write("hello\\n")
        print(f)
    """
}

process printContent {
    input:
    path(x)

    script:
    """
    cat $x
    """
}

workflow {
    createFiles()
    printContent(createFiles.out.apath)
}
executor > local (2)
[67/0c496d] process > createFiles [100%] 1 of 1 ✔
[1b/36a13c] process > printContent [100%] 1 of 1 ✔
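Because the path "*.txt" output is collected into a single emission containing all three files, printContent ran only once and cat-ed them together. Flattening the channel makes printContent run once per file instead: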
workflow {
    createFiles()
    printContent(createFiles.out.apath.flatten())
}
Is it possible to create files with the native execution mode of a process? For example, I attempted the following:
process WRITE_FASTP_METRICS {
    input:
    val(rna_result)
    val(adt_result)

    output:
    path "fastp_metrics.csv"

    exec:
    write_out = file("fastp_metrics.csv")
    rna_result.forEach { key, value ->
        write_out << key << ',' << value << '\n'
    }
    adt_result.forEach { key, value ->
        write_out << key << ',' << value << '\n'
    }
}
But the fastp_metrics.csv is not created in the work directory, causing this error: Missing output file(s) ``fastp_metrics.csv`` expected by process ``WRITE_FASTP_METRICS (1)``
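A commonly suggested fix (shown here as a sketch, keeping the same inputs and output as above) is to resolve the output path against task.workDir, so the file is written into the task's work directory, which is where Nextflow looks for the declared output:

process WRITE_FASTP_METRICS {
    input:
    val(rna_result)
    val(adt_result)

    output:
    path "fastp_metrics.csv"

    exec:
    // resolve against the task work directory instead of the launch directory
    def write_out = task.workDir.resolve('fastp_metrics.csv')
    rna_result.each { key, value ->
        write_out << "${key},${value}\n"
    }
    adt_result.each { key, value ->
        write_out << "${key},${value}\n"
    }
}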
I'm wondering if we can pass in a container as a variable, as I want to test the same process with various versions of a software tool. Something like this:
process A {
    container = container_label

    input:
    tuple val(container_label), path(inputFile)
    ...
}
This code did not work, however. Can it be done in another way?
process A {
    container = params.container_label

    input:
    path(inputFile)
    ...
}
process A {
    input:
    tuple val(container_label), path(inputFile)
    ...

    script:
    task.container = container_label
    ...
}
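Another option that might work (a sketch, not verified for this exact case): process directives can take dynamic values via a closure, and such a closure can reference the task's input values, so the container could in principle be chosen per input tuple:

process A {
    // dynamic directive, evaluated per task, so it can see the input value
    // (assumption: container_label holds a full image name, e.g. 'quay.io/biocontainers/tool:1.2.3')
    container { container_label }

    input:
    tuple val(container_label), path(inputFile)

    script:
    """
    echo "running in ${task.container}"   # placeholder command
    """
}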
process generate_readset {
    tag "$sample_id"
    cpus 48

    input:
    tuple val(read_name), val(chromosome1), val(chromosome2), val(cuteSV_pos1), val(cuteSV_pos2),
          val(sniffle_pos1), val(sniffle_pos2),
          path(cuteSV_vcf), path(sniffles_vcf) from vcf_input

    output:
    path 'complete_read_set.txt' into receiver

    script:
    """
    ${bcftools_1_11} view --threads ${task.cpus} $cuteSV_vcf -r chr$chromosome1:$cuteSV_pos1-$cuteSV_pos2 > complete.txt
    """
}
Remote resource not found: https://api.github.com/repos/PATH/TO/contents/main.nf. What am I doing wrong?
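This error usually means Nextflow cannot see the repository at that address, for example because the project name is misspelled or the repository is private. If it is a private repository, one thing worth checking (shown with placeholder credentials) is the SCM configuration file at $HOME/.nextflow/scm:

providers {
    github {
        user = 'your-github-username'
        password = 'your-personal-access-token'   // placeholder; use a real access token
    }
}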
Hey all,
I have an error related to Nextflow with the azurebatch executor. The first process, using the default D4_v3 VM, works fine, but for the second process I fail to request a larger VM (I set it via queue, but apparently it is not working; am I making some naive mistake?)
'''
Error executing process > 'secondprocess'
Caused by:
Cannot find a VM for task 'secondprocess' matching this requirements: type=Standard_D4_v3, cpus=16, mem=14 GB, location=eastus
'''
The config file I used:
process {
    executor = 'azurebatch'
}

docker {
    enabled = true
}

azure {
    batch {
        location = 'eastus'
        accountName = 'xxxbatch'
        accountKey = 'xxx'
        autoPoolMode = true
        allowPoolCreation = true
        deletePoolsOnCompletion = true
        deleteJobsOnCompletion = true
        pools {
            small {
                autoScale = true
                vmType = 'Standard_D4_v3'
                vmCount = 5
                maxVmCount = 50
            }
            large {
                autoScale = true
                vmType = 'Standard_D16_v3'
                vmCount = 5
                maxVmCount = 50
            }
        }
    }
    storage {
        accountName = "xxx"
        accountKey = "xxx"
    }
}

process {
    withName: firstprocess {
        queue = 'small'
    }
    withName: secondprocess {
        queue = 'large'
    }
}
I'm getting a "Disk quota exceeded" error from nextflow run, despite specifying an unquota'd path with -w aka -work-dir. Any ideas? In this thread, Paolo suggests -w is the solution… https://groups.google.com/g/nextflow/c/401Tp_6H57k/m/va8ACNeTAQAJ
$ nextflow run nf-core/viralrecon -w /users/xxx/test --help
N E X T F L O W ~ version 21.04.0
Pulling nf-core/viralrecon ...
Disk quota exceeded
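One likely explanation (an assumption, since the quota'd filesystem isn't stated): -w only relocates the pipeline work directory, while the pipeline code itself is pulled into $NXF_HOME (by default ~/.nextflow), so the quota can still be hit while pulling the repo. Pointing NXF_HOME at the unquota'd location should move that step as well, for example:

# move the Nextflow home (and its pipeline assets cache) off the quota'd filesystem
export NXF_HOME=/users/xxx/test/.nextflow
nextflow run nf-core/viralrecon -w /users/xxx/test --help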
How can I use this as the 'size' part of a groupTuple? I tried:
aligned_bams
    .groupTuple(by: 0, size: lane_calc)
But it did not like it - complained about the value type etc. All thoughts gladly received!
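One pattern that may help here (a sketch, assuming lane_calc is a per-sample lane count rather than a single fixed integer): size: expects a plain integer, but a per-key group size can be attached to the grouping key itself with the built-in groupKey function, for example:

// laneCounts is a hypothetical map from sample id to its expected number of lanes
aligned_bams
    .map { sample_id, bam -> tuple( groupKey(sample_id, laneCounts[sample_id]), bam ) }
    .groupTuple()        // each group is emitted as soon as its declared size is reached
    .view()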