Luca Cozzuto
or maybe you can do a nextflow of nextflows...
Cagatay Aydin
I'm not sure, but that might not be possible in our case, because the workers consume messages from another server, prepare the arguments for the nextflow command, and then call nextflow with those arguments
Luca Cozzuto
but then you have an orchestrator that is not working as an orchestrator
Nextflow should submit the jobs to your server; that's what it's for
Steven P. Vensko II

I've got a curious issue -- I am running Nextflow on a cluster that I typically do not use. Many of my processes are getting errors like the following:

[b8/551435] NOTE: Process `lens:manifest_to_dna_procd_fqs:trim_galore (VanAllen_antiCTLA4_2015/p017/ad-770067)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)

Yet, if I go to the work directory, the process is clearly still running:

(base) [spvensko@longleaf-login4 5cb96b7bf20f816c13f22f8f0e3b08]$ realpath .
(base) [spvensko@longleaf-login4 5cb96b7bf20f816c13f22f8f0e3b08]$ ls -lhdrt *
lrwxrwxrwx 1 spvensko users   75 Oct 21 13:11 VanAllen_antiCTLA4_2015-p013-nd-780020_1.fastq.gz -> /pine/scr/s/p/spvensko/fastqs/VanAllen_antiCTLA4_2015/SRR2780020_1.fastq.gz
lrwxrwxrwx 1 spvensko users   75 Oct 21 13:11 VanAllen_antiCTLA4_2015-p013-nd-780020_2.fastq.gz -> /pine/scr/s/p/spvensko/fastqs/VanAllen_antiCTLA4_2015/SRR2780020_2.fastq.gz
-rw-r--r-- 1 spvensko users 3.8G Oct 21 13:23 VanAllen_antiCTLA4_2015-p013-nd-780020_1_trimmed.fq.gz
-rw-r--r-- 1 spvensko users 3.3K Oct 21 13:23 VanAllen_antiCTLA4_2015-p013-nd-780020_1.fastq.gz_trimming_report.txt
-rw-r--r-- 1 spvensko users  630 Oct 21 13:23 VanAllen_antiCTLA4_2015-p013-nd-780020_2.fastq.gz_trimming_report.txt
-rw-r--r-- 1 spvensko users 2.0G Oct 21 13:30 VanAllen_antiCTLA4_2015-p013-nd-780020_2_trimmed.fq.gz

Anyone seen this behavior before?

1 reply
Is there a way to count the number of elements in a channel and exit if there aren't enough files?
I'm looking for something similar to .ifEmpty(), but instead of empty, I want it to do something if there are fewer than 5 values in it.
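One possible sketch (channel name hypothetical; the count operator reduces the channel to a single number, which can then be checked):

```nextflow
// Hypothetical input channel
my_files_ch = Channel.fromPath('data/*.fastq.gz')

my_files_ch
    .count()
    .subscribe { n ->
        if (n < 5)
            error "Expected at least 5 files, found ${n}"
    }
```

Whether error aborts cleanly from inside a subscribe closure is worth testing; it's a sketch of the idea, not a confirmed answer from the thread.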

-[nf-core/cutandrun] Pipeline completed with errors-
Error executing process > 'NFCORE_CUTANDRUN:CUTANDRUN:PREPARE_GENOME:GUNZIP_GTF (Sus_scrofa.Sscrofa11.1.104.gtf.gz)'

Caused by:
Process NFCORE_CUTANDRUN:CUTANDRUN:PREPARE_GENOME:GUNZIP_GTF (Sus_scrofa.Sscrofa11.1.104.gtf.gz) terminated with an error exit status (126)

Command executed:

gunzip -f Sus_scrofa.Sscrofa11.1.104.gtf.gz
echo $(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*$//' > gunzip.version.txt

Command exit status:

Command output:

Command error:
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create?name=nxf-4TCCNGoQeNL7feEvydV4IPwf: dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.

Work dir:

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

3 replies

Hi all,
I'm running into something strange. Until now, my pipeline used to produce this on stdout by the end:

Completed at: 22-Oct-2021 09:35:45
Duration    : 1m 26s
CPU hours   : (a few seconds)
Succeeded   : 8

I'm just changing a script (actually merging 2 scripts into 1) in a process and so changing the call to this script and the stdout report disappears...
Any help on that?
I actually discovered that I don't know which part of my code is producing this output... (I tried the very first example of the documentation and it doesn't produce the stdout report either)

Patrick Blaney
Hi Everyone,
I was wondering how splitText(by: 50, file: true) would treat the remainder if the input file being split is not an even multiple of 50. Would it simply create a file with the remaining number of lines?
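A small sketch to check the behavior directly (hypothetical 11-line input file; if splitText emits the remainder as a final, smaller chunk, the last file should report fewer lines):

```nextflow
workflow {
    Channel.fromPath('lines.txt')            // hypothetical file with 11 lines
        .splitText(by: 5, file: true)
        .view { f -> "${f.name}: ${f.countLines()} lines" }
}
```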
Steven P. Vensko II
Can someone please sanity check me real fast -- Is it possible to use both Singularity and Docker in a Nextflow script (for different processes)?
1 reply
Alex Mestiashvili
a Docker image can be converted to a Singularity image
5 replies
Raoul J.P. Bonnal
@pditommaso I am trying to create a new plugin. I am following nf-hello and I can compileGroovy, but when I run the launch I get a java.lang.NoClassDefFoundError for the dependency I need. I've added the dependencies to nf-my_plugin/plugins/nf-my_plugin/build.gradle as implementation ... and gradle downloaded them into the ~/.gradle cache directory. I am missing something for sure.
4 replies
Hi Everyone,
I tried putting process.cleanup = true in the config file as suggested here https://github.com/nextflow-io/nextflow/pull/2135/files but it does not seem to work (using nextflow-21.09.0-edge-all). I definitely don't understand how to use this option. Can somebody help me with this one? Thank you in advance.
2 replies
Arijit Panda

Hi Everyone,
I am new to Nextflow. Can someone help me write Nextflow code that processes multiple samples with multiple paired FASTQ files? Here is a link to the query I posted on Stack Overflow: https://stackoverflow.com/questions/69702077/nextflow-how-to-process-multiple-samples .

I can process a single sample at a time but not all of them. Here is my code for processing a single sample.

params.fastq_path = "data/${params.sampleName}/*{1,2}.fq.gz"

fastq_files = Channel.fromFilePairs(params.fastq_path)

params.ref = "ab.fa"
ref = file(params.ref)

process foo {
    input:
    set pairId, file(reads) from fastq_files

    output:
    file("${pairId}.bam") into bamFiles_ch

    script:
    """
    echo ${reads[0].toRealPath().getParent().baseName}
    bwa-mem2 mem -t 8 ${ref} ${reads[0].toRealPath()} ${reads[1].toRealPath()} | samtools sort -@8 -o ${pairId}.bam
    samtools index -@8 ${pairId}.bam
    """
}

process samToolsMerge {
    publishDir "./aligned_minimap/", mode: 'copy', overwrite: 'false'

    input:
    file bamFile from bamFiles_ch.collect()

    script:
    """
    samtools merge ${runString}.bam ${bamFile}
    samtools index -@8 ${runString}.bam
    """
}
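On the multi-sample question, one commonly suggested sketch (assuming the data/<sample>/ layout implied above): let fromFilePairs glob across every sample directory instead of fixing a single params.sampleName:

```nextflow
// Sketch: match pairs across all sample directories;
// pairId then identifies each pair, and the alignment process runs once per pair
params.fastq_path = "data/*/*{1,2}.fq.gz"
fastq_files = Channel.fromFilePairs(params.fastq_path)
```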
James Mathews

The channel.fromPath function seems to accept only bonafide path strings or glob patterns.
How can one make a channel from a path string which is itself only known by looking inside another file?

process retrieve_another_filename {
    input:
    path metadata_file

    output:
    stdout emit: other_filename

    script:
    """
    cat $metadata_file
    """
}

process show_file_contents {
    input:
    path other_filename

    output:
    stdout emit: contents

    script:
    """
    cat $other_filename
    """
}

workflow {
    metadata_file_ch = channel.fromPath('metadata.txt')
    retrieve_another_filename(metadata_file_ch)

    // another_file_ch = channel.fromPath(retrieve_another_filename.out. ... ) ?
}

(Note: Trying to be DSL2-compliant.)
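One pattern sometimes used for this (a sketch, not a confirmed answer from the thread): map the emitted filename string to a file object instead of calling channel.fromPath again:

```nextflow
workflow {
    metadata_file_ch = channel.fromPath('metadata.txt')
    retrieve_another_filename(metadata_file_ch)

    // stdout typically carries a trailing newline, hence trim()
    another_file_ch = retrieve_another_filename.out.other_filename
        .map { it.trim() }
        .map { file(it) }

    show_file_contents(another_file_ch)
}
```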

2 replies
Has anyone seen Nextflow pipelines throw JDK errors (potentially more frequently in ones that run a lot or on a cron job)? I see hs_err_pidxxxx.log files show up in the directories where I run them on a Slurm cluster, describing segfault/sigbus errors. I found this mentioned in #842 and tried changing the flag; sometimes that resolves the issue (but potentially at the cost of ruining resuming, which I would like to retain), and sometimes it just changes the nature of the error slightly. Any input on this would be great. I'll post the full error in a thread reply to this so it doesn't take up all the space on the page... [repost, but still haven't found any solution]
3 replies
Adam Price

I am trying to do something fairly basic, but can't work out the groovy/nextflow-ish way to do it. Fairly new to nextflow/groovy.

I have two channels that look like this:

samples = ["sample1_a", "sample1_b", "sample2_a", "sample2_b", "sample3_a"]
pairs = [ ["sample1_a", "sample1_b"], ["sample2_a", "sample2_b"], ["sample3_a", "sample3_b"] ]

I want to select from samples the ones that have both matching pairs. So in pseudocode logic, I want to accomplish this:

validList = []
for (i in pairs.size()):
if (pairs[i][0] in samples && pairs[i][1] in samples)

I feel like there should be a pretty straightforward way to use .map or something to accomplish this. Any ideas?

1 reply
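One straightforward sketch (assuming samples and pairs start as the plain Groovy lists written above): build a set from samples and filter pairs against it:

```nextflow
// Sketch: keep only the pairs whose two members both exist in `samples`
sampleSet = samples as Set

valid_pairs_ch = Channel.fromList(pairs)
    .filter { pair -> pair[0] in sampleSet && pair[1] in sampleSet }

valid_pairs_ch.view()   // should keep the sample1 and sample2 pairs only
```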
Somak Chowdhury
nextflow runs the test profile using Docker, but when running with samples specified in a CSV file it says no input file specified. Can someone suggest what is going wrong?
Benjamin Wingfield

I want to run mawk -f hello_world.awk, and hello_world.awk lives in bin/. What's the nextflowy-est way to do this? (I know it seems weird)

I want to explicitly call mawk for portability (the binary can live in different places) and to use extra arguments like -v. Shebangs aren't portable. env won't take multiple parameters, unless I'm over- or under-thinking things. At the minute I'm doing some weird shell magic
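One sketch of a workaround (assuming the script sits in the pipeline-level bin/ directory): since bin/ is put on the PATH rather than staged into the work directory, reference the script by its project-relative location so mawk can be invoked explicitly with extra flags (process and input names here are hypothetical):

```nextflow
process run_awk {
    input:
    path data                     // hypothetical input file

    script:
    """
    mawk -v greeting=hello -f ${projectDir}/bin/hello_world.awk ${data}
    """
}
```

(${projectDir} is called baseDir in older Nextflow releases.)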

W. Lee Pang, PhD
Is there a way to get a workflow's session hash before it is run?
Hi folks, can anyone explain to me why the following snippet won't print out the exit msg to the console? Basically, I have to comment out the TEST_MODULE() line, otherwise the "Error msg" won't be displayed to console.
nextflow.enable.dsl = 2

process TEST_MODULE {
    script:
    """
    echo "test" > test.txt
    """
}

workflow {
    TEST_MODULE()

    exit 1, "Error msg"
}
Greg Gavelis Code Portfolio
Hi there, When troubleshooting caching issues--does anyone know how to parse the outputs of the -dump-hashes option?
(I'm trying to understand why only a minority of my outputs cache, even though they run successfully. )
The tips on the 'troubleshooting Nextflow resume' page suggest running Nextflow twice with the -dump-hashes option, then comparing the differences between the log files. But what are we looking for, exactly? I do not see errors in either file.
First logfile: https://pastebin.com/MgPMWLxX
Second logfile: https://pastebin.com/XgBz3pHs
9 replies
Hi everyone,
This is more of a general question, but does anyone have tips or recommendations for when you want to create and maintain a directory structure from the start of the script to the end?
For example, I am writing a script that fetches sequence files from a database, runs clustering on them, creates a directory for each cluster containing its sequences, and finally annotates each file.
I don't know if it's ideal to input and output the entire base directory for every process in the script. Also, when I use the qualifier "each", the directory structure is lost, and I would like to know how everyone works around this.
2 replies
Paul Cantalupo
where does the nextflow log value called REVISION ID come from? How is it calculated?

With DSL2 is there a way to make staging of files from S3 lazy? When I was working with vanilla nextflow I could get it to delay staging files until the process starts executing by taking a channel that contains S3 path strings and mapping it to file() in the input block of the process.

process example {
  input:
  file('input.txt') from s3pathStrings.map{ file(it) }
}

with DSL2 it now looks like this:

data = s3pathStrings.map{ file(it) }

process example {
  input:
  path 'input.txt'
}

In the first version (before DSL2) nextflow would schedule some concurrent tasks, and as those tasks execute it would trigger the staging of the remote files they needed for execution. In the DSL2 version all the files are staged in the top level scope, before any "example" tasks start executing.

Since from isn't part of DSL2 it doesn't seem that it is possible to use this trick any more. Is there another way to do this with DSL2? It was nice to have this behavior because if there are a lot of files to stage at the same time it sometimes causes s3 connections to time out. It was also a little nicer than limiting the parallel transfers because it helps prioritize which files to stage first to enable the first tasks to start executing more quickly.


Hi all, new to Nextflow here. Two related questions:

I want the following in my config file:

process {
  publishDir {
    path = '.'
    mode = 'link'
    enabled = { task.ext.publish }
  }
}

So that I can turn on publication in my processes by just setting task.ext.publish to true or false, depending. (This way I don't have to respecify the path or link each time)

However, there are two issues: in the config file, it seems that task.ext.publish is interpreted as process.publishDir.task.ext.publish, which isn't what I want. Also, I can't seem to set ext.publish from my process; the period in ext.publish true seems to be screwing things up.

Any advice? Thanks!
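One possible direction (a sketch under the assumption that config directives are assigned as values rather than nested scopes, which may be why the block above gets parsed as process.publishDir.task.ext.publish, and that ext can be set per process via withName; MY_PROC is a hypothetical process name):

```nextflow
process {
    publishDir = [
        path: '.',
        mode: 'link',
        enabled: { task.ext.publish ?: false }
    ]

    withName: 'MY_PROC' {
        ext.publish = true
    }
}
```

Whether the enabled closure is evaluated with task context available is worth testing before relying on it.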


Hi! When I try to run nf-core on my local PC after a long time, I get the following error. Does anyone know the cause?

nextflow run nf-core/rnaseq -profile test,docker -r 3.3

N E X T F L O W  ~  version 21.09.1-edge
Launching `nf-core/rnaseq` [disturbed_meucci] - revision: 8094c42add [3.3]

ERROR: Validation of pipeline parameters failed!

* --hostnames: expected type: String, found: JSONObject ({"cfc":[".hpc.uni-tuebingen.de"],"utd_sysbio":["sysbio.utdallas.edu"],"utd_ganymede":["ganymede.utdallas.edu"],"genouest":[".genouest.org"],"cbe":[".cbe.vbc.ac.at"],"genotoul":[".genologin1.toulouse.inra.fr",".genologin2.toulouse.inra.fr"],"crick":[".thecrick.org"],"uppmax":[".uppmax.uu.se"],"icr_davros":[".davros.compute.estate"],"imperial":[".hpc.ic.ac.uk"],"binac":[".binac.uni-tuebingen.de"],"imperial_mb":[".hpc.ic.ac.uk"]})
After reinstalling Nextflow, the error no longer occurs. It may have something to do with the fact that I recently updated Ubuntu to 21.10.
I could not figure out how to uninstall Nextflow, so I simply ran the following command one more time: curl -s https://get.nextflow.io | bash. Then I moved the created nextflow file onto the PATH. Now it works properly.
@kojix2 can you come over to the nf-core slack. I think I remember some discussion about this issue
Nevermind, found the related issue: nf-core/tools#1304
1 reply
Jeffrey Massung

Is it possible for me to access the params such that I can write them as an output? In my particular workflow, several of the params used are things like "password needed to unzip file", which I'd like to save for posterity in the output location. I'm basically trying to do:

import groovy.json.JsonBuilder

params_json = new JsonBuilder(params).toString()

process xxx {
    shell:
    '''echo !{params_json} > params.json'''
}

But it's not letting me. Is there some other nice way for me to do this?

9 replies
I've been using this as boilerplate code to find reads in the current directory. I would like to make it less verbose, but that would need extended globbing, and my initial tests have been unsuccessful even with all possible escaping sequences, trying +([0-9]). How do you all deal with this?
                        "*_S[0-9][0-9][0-9][0-9]_L00[1-9]_{R1,R2}_001.fastq.gz"]).set { illumina_q }
Steven P. Vensko II
Does nextflow clean -before <desired_run> delete the CACHED directories it utilized (if they are from an earlier run prior to <desired_run>)?

Hey all. I'm having trouble cat-ing two files together under shell. Outside of the Nextflow run, everything works flawlessly.
echo "read1: NA12878_S1_L001_R1_001.fastq.gz NA12878_S1_L002_R1_001.fastq.gz"
echo "read2: NA12878_S1_L001_R2_001.fastq.gz NA12878_S1_L002_R2_001.fastq.gz"
echo $(ls)
cat NA12878_S1_L001_R1_001.fastq.gz NA12878_S1_L002_R1_001.fastq.gz > read1.fastq.gz
cat NA12878_S1_L001_R2_001.fastq.gz NA12878_S1_L002_R2_001.fastq.gz > read2.fastq.gz

Command exit status:

Command output:
read1: NA12878_S1_L001_R1_001.fastq.gz NA12878_S1_L002_R1_001.fastq.gz
read2: NA12878_S1_L001_R2_001.fastq.gz NA12878_S1_L002_R2_001.fastq.gz
NA12878_S1_L001_R1_001.fastq.gz NA12878_S1_L001_R2_001.fastq.gz NA12878_S1_L002_R1_001.fastq.gz NA12878_S1_L002_R2_001.fastq.gz

Command error:
cat: NA12878_S1_L001_R1_001.fastq.gz: No such file or directory
cat: NA12878_S1_L002_R1_001.fastq.gz: No such file or directory

13 replies
Hello all,
Does anyone know if it is possible to mix more than one language in the script block? E.g. run a specific bash command first, then run the rest with Python? Thanks!
2 replies
Pavel Borobov

Hi all, how much does it cost to run a workflow using Tower and AWS Batch + FSx for Lustre? I'm working on a customer project right now and want to evaluate this option. Perhaps some guidance to the documentation and pricing model is needed.
Using aws-batch mode with an S3-only work directory is not very efficient.

What is the commercial license price?

Steffen Fehrmann
:point_up: August 11, 2019 12:50 PM
Hi all, I'm looking for a method to convert paired-end data from bcl2fastq process output into a fastq file-pair tuple. @happykhan's method is the only one I found. Is there some more elegant way with DSL2? Currently I have a standard bcl2fastq process that emits *.fastq.gz and I can read in the sample sheet, but I'd need to match sample sheet names and filenames. Is there some way to use something like the .fromFilePairs factory on an existing channel?
Ernesto Lowy
Hi, I've a question on a process using a Python block and emitting files in DSL2.
This is the workflow:
process createFiles {
    output:
    path("*.txt", emit: apath)

    script:
    """
    #!/usr/bin/env python
    filenames = ['a.txt', 'b.txt', 'c.txt']
    for f in filenames:
        with open(f, "w") as wf:
            wf.write(f)
    """
}

process printContent {
    input:
    path x

    script:
    """
    cat $x
    """
}

workflow {
    createFiles()
    printContent(createFiles.out.apath)
}
The first process (createFiles) creates the files and emit a channel with the paths of the created files using a Python block and the second (printContent) prints the contents of each file.
My question is:
When I run this workflow, I see the following information from NF:
executor >  local (2)
[67/0c496d] process > createFiles  [100%] 1 of 1 ✔
[1b/36a13c] process > printContent [100%] 1 of 1
Ernesto Lowy
And it seems that the three files generated by createFiles are emitted as a single item, so all the files are taken together by printContent instead of each being analysed in an independent printContent job.
7 replies
Can you let me know how to analyse each of the files independently?
Luca Cozzuto
Hi @elowy01 :) you can either make the file names a channel outside your block of code, as suggested by @pcantalupo, or use the flatten operator
Paul Cantalupo
@lucacozzuto where do you add flatten? Can you post the code? thank you
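For reference, a minimal sketch of the flatten suggestion (using the process names from the snippet above): applying flatten to the output channel splits the single multi-file emission into one item per file, so the downstream process runs once per file:

```nextflow
workflow {
    createFiles()

    // without flatten: one item holding all three paths -> one printContent task
    // with flatten: three items -> three printContent tasks
    printContent(createFiles.out.apath.flatten())
}
```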