nextflow.file.FileHelper - Can't check if specified path is NFS (1): /mop2-bucket-1/scratch
publishDir with DSL2: this announcement suggested that there would be improvements here, and I found this, which seems to provide some mechanism, but I was assuming that you should be able to define a process that does not need to worry about publishDir and let the workflow choose which outputs to publish. Is this possible? Any examples?
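For what it's worth, a hedged sketch of one workaround (not necessarily the mechanism that announcement refers to; the process name and output path below are placeholders): keep publishDir out of the process body and attach it from the config's process scope instead.
// Sketch only: publishDir set from nextflow.config rather than inside the process
process {
    withName: 'FOO' {
        publishDir = [ path: "${params.outdir}/foo", mode: 'copy' ]
    }
}
This doesn't let the workflow itself pick which outputs to publish, but it does keep the directive out of the process definition.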
Is it possible to use a .count() channel as the size parameter for groupTuple? I currently have:
bcftools_index_somatic.out.vcfs_w_csis.groupTuple(by: [0, 1, 2, 3], size: extract_chroms_from_bed.out.chroms_list.count()).set{ vcfs_by_patient }
But I get the error:
Value 'DataflowVariable(value=null)' cannot be used in in parameter 'size' for operator 'groupTuple' -- Value don't match: class java.lang.Integer
Is it possible to convert the .count() channel into something consumable by size:?
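One hedged way around this (the tuple layout in the map below is an assumption) is to append the count with combine() and bake it into the grouping key with groupKey(), so groupTuple() knows the expected group size without a size: parameter:
// Sketch only: combine() appends the chromosome count to every tuple, then
// groupKey() carries the expected group size into groupTuple()
n_chroms = extract_chroms_from_bed.out.chroms_list.count()

bcftools_index_somatic.out.vcfs_w_csis
    .combine( n_chroms )
    .map { patient, sample, type, caller, vcf, csi, n ->
        // first four elements assumed to be the grouping key (by: [0, 1, 2, 3])
        tuple( groupKey( [patient, sample, type, caller], n as int ), vcf, csi )
    }
    .groupTuple()
    .set { vcfs_by_patient }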
$params, because this gives a comma-separated map, but I'd like something that returns a table. Any ideas?
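If all you're after is something more readable than the default map rendering, a minimal sketch (plain Groovy iteration over the params map, keys padded into two columns):
// Sketch only: one parameter per line instead of the comma-separated map
log.info 'Parameters:'
params.each { k, v ->
    log.info String.format('  %-25s: %s', k, v)
}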
docker.fixOwnership = true (and install procps in your image, otherwise Nextflow will complain that you don't have ps installed). Best guess is that the mismatch in the ids for the owner of the image manifest file was preventing it from being accessed. Haven't tried it, but a quick peek at the source code suggests that setting NXF_OWNER will also make things work.
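For reference, a minimal sketch of where that setting lives in nextflow.config (the enabled flag is an assumption about the rest of the setup):
// Sketch only: docker scope with ownership fixing turned on
docker {
    enabled      = true
    fixOwnership = true   // fix ownership of files created by the container
}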
Command error:
Execution failed: generic::failed_precondition: while running "nf-2f69f94540149df9eda94c49022f51cc-main": unexpected exit status 1 was not ignored
nextflow-21.10.5 - any workflow does this, including hello-world:
gsutil cat gs://mygcpbucket/nextflow/f4/72c513e7a923fb0c80b30fc74c669d/google/logs/output
/bin/bash: /nextflow/f4/72c513e7a923fb0c80b30fc74c669d/.command.log: Permission denied
+ trap 'err=$?; exec 1>&2; gsutil -m -q cp -R /nextflow/f4/72c513e7a923fb0c80b30fc74c669d/.command.log gs://truwl-internal-inputs/nextflow/f4/72c513e7a923fb0c80b30fc74c669d/.command.log || true; [[ $err -gt 0 || $GOOGLE_LAST_EXIT_STATUS -gt 0 || $NXF_DEBUG -gt 0 ]] && { ls -lah /nextflow/f4/72c513e7a923fb0c80b30fc74c669d || true; gsutil -m -q cp -R /google/ gs://truwl-internal-inputs/nextflow/f4/72c513e7a923fb0c80b30fc74c669d; } || rm -rf /nextflow/f4/72c513e7a923fb0c80b30fc74c669d; exit $err' EXIT
+ err=1
+ exec
+ gsutil -m -q cp -R /nextflow/f4/72c513e7a923fb0c80b30fc74c669d/.command.log gs://truwl-internal-inputs/nextflow/f4/72c513e7a923fb0c80b30fc74c669d/.command.log
+ [[ 1 -gt 0 ]]
+ ls -lah /nextflow/f4/72c513e7a923fb0c80b30fc74c669d
total 40K
drwxr-xr-x 3 root root 4.0K Dec 10 23:51 .
drwxr-xr-x 3 root root 4.0K Dec 10 23:51 ..
-rw-r--r-- 1 root root 3.3K Dec 10 23:51 .command.log
-rw-r--r-- 1 root root 5.3K Dec 10 23:51 .command.run
-rw-r--r-- 1 root root 36 Dec 10 23:51 .command.sh
drwx------ 2 root root 16K Dec 10 23:50 lost+found
+ gsutil -m -q cp -R /google/ gs://mygcpbucket/nextflow/f4/72c513e7a923fb0c80b30fc74c669d
I was running a Nextflow job with about 2k tasks on AWS Batch. Unfortunately, the Docker container for one of the processes contained an error (Exception in thread "Thread-1" java.awt.AWTError: Assistive Technology not found: org.GNOME.Accessibility.AtkWrapper), and I had to kill the nextflow run. I guess I must have hit CTRL+C twice, because while the interactive nextflow CLI progress stopped, I'm still left with thousands of RUNNABLE jobs in AWS Batch.
Is there any quick way to remove them without potentially affecting other nextflow runs using the same compute queue?
How can I avoid similar issues in the future? I.e. how should I properly cancel a running nextflow run and make it clean up its jobs in Batch?
ch_spectra_summary.map { tuple_summary ->
        def key = tuple_summary[0]
        def summary_file = tuple_summary[1]
        def list_spectra = tuple_summary[1].splitCsv(skip: 1, sep: '\t')
            .flatten{ it -> it }
            .collect()
        return tuple(key.toString(), list_spectra)
    }
    .groupTuple()
    .set { ch_spectra_tuple_results }
This currently gives tuples like:
[supp_info.mzid.gz, [[supp_info.mzid, 2014-06-24, Velos005137.mgf, MGF, Velos005137.mgf, ftp://ftp.ebi.ac.uk/pride-archive/2014/06/PXD001077/Velos005137.mgf]]]
but all I really need is the link, e.g. ftp://ftp.ebi.ac.uk/pride-archive/2014/06/PXD001077/Velos005137.mgf, so the result would look like:
[supp_info.mzid.gz, [ftp://ftp.ebi.ac.uk/pride-archive/2014/06/PXD001077/Velos005137.mgf]]
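A hedged guess at a fix (assuming the link is always the last column of the tab-separated summary; the channel name ch_spectra_links is made up):
// Sketch only: keep just the ftp:// column of each row instead of the whole row
ch_spectra_summary
    .map { key, summary_file ->
        def links = summary_file
            .splitCsv(skip: 1, sep: '\t')
            .collect { row -> row[-1] }   // last column assumed to hold the link
        tuple( key.toString(), links )
    }
    .set { ch_spectra_links }
// emits e.g. [supp_info.mzid.gz, [ftp://ftp.ebi.ac.uk/pride-archive/2014/06/PXD001077/Velos005137.mgf]]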
Hello,
I have the following python script in a nextflow process:
process update_image {
    script:
    """
    #!/usr/bin/env python3
    import os, subprocess
    subprocess.check_call(['singularity', 'pull', 'docker://busybox'])
    """
}
Singularity is installed and is in the $PATH. The config file looks like:
singularity {
    singularity.enabled = true
    singularity.autoMounts = true
}
However, I get the error: No such file or directory: 'singularity'. Any ideas what might be wrong here?
Try the following (settings inside the singularity scope shouldn't repeat the singularity. prefix):
singularity {
enabled = true
autoMounts = true
}
I have a regular expression in my parameters
params {
normal_name = /^NF-.*-3.*/
}
that I use to match in my workflow elsewhere
def split_normal = branchCriteria {
normal: it.name =~ params.normal_name
}
I'm trying to override this parameter as a CLI argument with --normal_name '/^NF-.*-4.*/', but then it's treated as a string in the workflow instead. Is there a good way to handle this, perhaps by compiling the parameter in the workflow?
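One hedged approach (assuming you're free to change the default): keep the parameter as a plain string everywhere, config default included, and drop the surrounding slashes on the command line, since =~ compiles a string operand into a regex at match time anyway:
// Sketch only: config default as a string, e.g. normal_name = '^NF-.*-3.*',
// overridable with --normal_name '^NF-.*-4.*' (no surrounding slashes)
def split_normal = branchCriteria {
    normal: it.name =~ params.normal_name
}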
Hi, I'm trying to get the sarek pipeline to work on our HPC cluster. Here is my command:
nextflow run nf-core/sarek -r 2.7.1 -profile singularity -c nfcore_ui_hpc.config --input '/Users/rsompallae/projects/janz_lab_wes/fastqs_1.tsv' --genome mm10
and the error I get is
I feel like this has to do with some parameter adjustment, but I'm not sure how to fix it.
Thanks in advance for your help.
I would like to use split -l 1 on an input file and then emit each small file on its own, with a tuple maintaining metadata for the initial file that was split, instead of having all of them contained in that one field of the tuple. Something like this:
process split {
    input:
    tuple path(run), val(plateid), path(file) from ex_ch

    output:
    tuple path(run), val(plateid), path("file_*") into parse_ch

    script:
    """
    tail -n +17 $file > sample_lines.csv
    split -l 1 -d sample_lines.csv file_
    """
}
But this should ideally emit as many tasks as there are lines in sample_lines.csv. With this setup they're all caught into a single channel, and you get a tuple that looks like:
['/path/to/run', 'plate_id', 'file_1, file_2, file_3, file_4']
Anyone have a quick way to do this? Maybe it's just a .multiMap / .map, but I can't seem to get it right.
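One hedged option (the downstream channel name is made up): leave the process as it is and unroll the file list afterwards with the transpose operator, so each split file becomes its own tuple.
// Sketch only: [run, plateid, [file_00, file_01, ...]] becomes
// [run, plateid, file_00], [run, plateid, file_01], ...
parse_ch
    .transpose()
    .set { per_line_ch }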
.splitCsv operator. Now this works nicely when running locally or on a cluster, but now I've tried to move this to AWS S3. Is there a way to retrieve the S3 directory location where the files are stored and put it in the CSV file? Or how would I approach this?
file "*.rds" into group_chunks_list
. I tried then fromList
but without luck. I am not sure if I am missing smth.
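If the goal is one channel item per .rds file rather than one list per task, a hedged sketch (both the intent and the downstream channel name are guesses):
// Sketch only: flatten() unrolls the per-task file list so downstream
// processes receive one .rds file at a time
group_chunks_list
    .flatten()
    .set { group_chunks_each }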
Error: Could not find or load main class run
I sometimes get this error immediately when running nextflow, and no files are produced by nextflow. What does this error actually mean? I can see that the .nf script I am calling is accessible and the nextflow executable is also accessible.
Can you pass file() a path to gs://? It's not liking it at the moment, and I'm guessing it might be user error. Works fine for file() in a process.
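For what it's worth, a minimal sketch of what I'd expect to work once Google credentials are configured (the bucket and object below are placeholders):
// Sketch only: file() should resolve a gs:// URI like any other path when
// Google Cloud storage support is set up; this bucket/object is made up
def ref = file('gs://my-bucket/reference/genome.fa')
println ref.exists()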