James Fellows Yates
@jfy133:matrix.org
[m]
Filipe Alves
@FilAlves
@jfy133:matrix.org Why don't you use the VSCode IDE?
1 reply
James Fellows Yates
@jfy133:matrix.org
[m]
(I do use VSCod(ium), but I'm using the console to make very small prototypes to test certain concepts; I'm not actually developing anything in the console)
Filipe Alves
@FilAlves
Try searching how to customise a groovy console.
I found this online https://stackoverflow.com/questions/47893514/how-to-change-the-font-of-groovyconsole
Hope this helps
1 reply
Pablo Riesgo-Ferreiro
@priesgo

Hi people, I stumbled upon an issue for which I am clueless. I have a small piece of R code, run with Rscript, that uses the sequenza library.

Rscript -e 'test <- sequenza::sequenza.extract("${seqz}", verbose = TRUE);'

The above fails when ${seqz} contains the absolute or relative path to a symbolic link, but if it has the "real" path to the file it works. Does someone have a hypothesis for what may be happening?

Nah don't listen to me, this is something else....
Pablo Riesgo-Ferreiro
@priesgo
Same file copied in different locations works or does not work... but consistently the one that works always works... agggh
doesn't seem like a nextflow issue, sorry for the noise
Luca Cozzuto
@lucacozzuto
dear all
sometimes I stumble on a problem with R libraries
using nextflow + singularity
it is like R is looking for libraries in the user space instead of inside the container
I put this in my nextflow.config but it is not helping
env {
    PYTHONNOUSERSITE = 1
    R_PROFILE_USER   = "/.Rprofile"
    R_ENVIRON_USER   = "/.Renviron"
}
any clue?
1 reply
Luca Cozzuto
@lucacozzuto
the script is also running with the parameter --vanilla
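One hedged guess: if Singularity is auto-mounting the home directory, R can still pick up user-installed libraries from there. A minimal config sketch, assuming the home mount is the culprit, that tells Singularity not to mount $HOME:

singularity {
    // assumes Singularity is already the container engine in use
    enabled    = true
    runOptions = '--no-home'
}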
Cagatay Aydin
@kmotoko_gitlab
Hello People,
We have parallel workers, which consume messages coming from a server, and then run nextflow run. The problem is, if the number of messages is high, these parallel workers initialize nextflow at almost the same time. This causes the following error to occur at high frequency: Can't lock file: /home/myhomedir/.nextflow/history -- Nextflow needs to run in a file system that supports file locks. I suspect this happens because when one nextflow process puts a lock on $HOME/.nextflow/history, another nextflow process tries to lock the same file before it is released by the former. Is this intended? Any ideas how to handle this properly without dirty workarounds?
Luca Cozzuto
@lucacozzuto
sorry are you running nextflow in parallel?
Cagatay Aydin
@kmotoko_gitlab
yes, each worker (12 in total) executes a nextflow process independently; if there is more than one job in the queue, they could be running in parallel

I checked the source code; the error is raised here, in modules/nextflow/src/main/groovy/nextflow/util/HistoryFile.groovy:

            try {
                while( true ) {
                    lock = fos.getChannel().tryLock()
                    if( lock ) break
                    if( System.currentTimeMillis() - ts < 1_000 )
                        sleep rnd.nextInt(75)
                    else {
                        error = new IllegalStateException("Can't lock file: ${this.absolutePath} -- Nextflow needs to run in a file system that supports file locks")
                        break
                    }
                }
                if( lock ) {
                    return action.call()
                }
            }

The problem is, it tries to lock for a sec (if I'm reading Java correctly) and then quits if it can't. Am I not supposed to run multiple nextflow processes in parallel?

Cagatay Aydin
@kmotoko_gitlab
One workaround could be to use a different .nextflow/history file path for each worker, but apparently it is hardcoded in the same file.
Cagatay Aydin
@kmotoko_gitlab
Also note that the error string is not entirely accurate: the filesystem supports file locks in my case; it is just that the file itself is locked by another nextflow process within a tiny time window.
Luca Cozzuto
@lucacozzuto
well I think you are going a bit against the nextflow philosophy here
you should use nextflow to parallelize rather than parallelizing nextflow
or maybe you can do a nextflow of nextflows...
Cagatay Aydin
@kmotoko_gitlab
I'm not sure, but that might not be possible in our case, because the workers consume messages from another server, prepare the arguments for the nextflow command, and then call nextflow with those arguments
Luca Cozzuto
@lucacozzuto
but then you have an orchestrator that is not working as an orchestrator
nextflow should submit the jobs to your server, this is what it is for
Steven P. Vensko II
@spvensko_gitlab

I've got a curious issue -- I am running Nextflow on a cluster that I typically do not use. Many of my processes are getting errors like the following:

[b8/551435] NOTE: Process `lens:manifest_to_dna_procd_fqs:trim_galore (VanAllen_antiCTLA4_2015/p017/ad-770067)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)

Yet, if I go to the work directory, the process is clearly still running:

(base) [spvensko@longleaf-login4 5cb96b7bf20f816c13f22f8f0e3b08]$ realpath .
/pine/scr/s/p/spvensko/work/bd/5cb96b7bf20f816c13f22f8f0e3b08
(base) [spvensko@longleaf-login4 5cb96b7bf20f816c13f22f8f0e3b08]$ ls -lhdrt *
lrwxrwxrwx 1 spvensko users   75 Oct 21 13:11 VanAllen_antiCTLA4_2015-p013-nd-780020_1.fastq.gz -> /pine/scr/s/p/spvensko/fastqs/VanAllen_antiCTLA4_2015/SRR2780020_1.fastq.gz
lrwxrwxrwx 1 spvensko users   75 Oct 21 13:11 VanAllen_antiCTLA4_2015-p013-nd-780020_2.fastq.gz -> /pine/scr/s/p/spvensko/fastqs/VanAllen_antiCTLA4_2015/SRR2780020_2.fastq.gz
-rw-r--r-- 1 spvensko users 3.8G Oct 21 13:23 VanAllen_antiCTLA4_2015-p013-nd-780020_1_trimmed.fq.gz
-rw-r--r-- 1 spvensko users 3.3K Oct 21 13:23 VanAllen_antiCTLA4_2015-p013-nd-780020_1.fastq.gz_trimming_report.txt
-rw-r--r-- 1 spvensko users  630 Oct 21 13:23 VanAllen_antiCTLA4_2015-p013-nd-780020_2.fastq.gz_trimming_report.txt
-rw-r--r-- 1 spvensko users 2.0G Oct 21 13:30 VanAllen_antiCTLA4_2015-p013-nd-780020_2_trimmed.fq.gz

Anyone seen this behavior before?

1 reply
Young
@erinyoung
Is there a way to count the number of elements in a channel and exit if there aren't enough files?
I'm looking for something similar to .ifEmpty(), but instead of empty, I want it to do something if there are fewer than 5 values in it.
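One possibility, sketched with an invented channel and glob: count the items with count() and raise an error from the resulting single-item channel when the total is too small, which aborts the run. An exception thrown from an operator closure is arguably a blunt instrument; exit 1, "..." at script level is the other common route.

files_ch = Channel.fromPath('data/*.fastq.gz')

files_ch
    .count()
    .subscribe { n ->
        // fail the run when fewer than five items were seen
        if( n < 5 )
            throw new IllegalStateException("Expected at least 5 files, found ${n}")
    }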
yinshiyi
@yinshiyi

-[nf-core/cutandrun] Pipeline completed with errors-
Error executing process > 'NFCORE_CUTANDRUN:CUTANDRUN:PREPARE_GENOME:GUNZIP_GTF (Sus_scrofa.Sscrofa11.1.104.gtf.gz)'

Caused by:
Process NFCORE_CUTANDRUN:CUTANDRUN:PREPARE_GENOME:GUNZIP_GTF (Sus_scrofa.Sscrofa11.1.104.gtf.gz) terminated with an error exit status (126)

Command executed:

gunzip -f Sus_scrofa.Sscrofa11.1.104.gtf.gz
echo $(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*$//' > gunzip.version.txt

Command exit status:
126

Command output:
(empty)

Command error:
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create?name=nxf-4TCCNGoQeNL7feEvydV4IPwf: dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.

Work dir:
/home/shiyi/mnt/cutandrun/work/28/6b962f871011c8668987a3e93dc4eb

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

3 replies
mbahin
@mbahin

Hi all,
I'm running into something strange. Until now, my pipeline used to produce this on stdout by the end:

Completed at: 22-Oct-2021 09:35:45
Duration    : 1m 26s
CPU hours   : (a few seconds)
Succeeded   : 8

I just changed a script in a process (actually merging 2 scripts into 1), and therefore changed the call to this script, and the stdout report disappeared...
Any help on that?
I actually discovered that I don't know which part of my code is producing this output... (I tried the very first example of the documentation and it doesn't produce the stdout report either)
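For what it's worth, a summary like that is often printed by a completion handler rather than by any process, so it can disappear if the handler (or the config/script that defines it) stops being loaded. A minimal sketch of such a handler using standard workflow metadata fields; the exact fields and formatting in the original pipeline may differ:

workflow.onComplete {
    println "Completed at: ${workflow.complete}"
    println "Duration    : ${workflow.duration}"
    println "Success     : ${workflow.success}"
}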

Patrick Blaney
@pblaney
Hi Everyone,
I was wondering how splitText(by: 50, file: true) would treat the remainder if the input file being split is not an even multiple of 50. Would it simply create a file with the remaining number of lines?
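A quick empirical way to check, with an invented file name: split a file whose line count is not a multiple of 50 and print the size of each chunk; the last chunk shows how the remainder is handled.

Channel
    .fromPath('sample.txt')
    .splitText(by: 50, file: true)
    .map { chunk -> tuple(chunk.name, chunk.readLines().size()) }
    .view()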
Steven P. Vensko II
@spvensko_gitlab
Can someone please sanity check me real fast -- Is it possible to use both Singularity and Docker in a Nextflow script (for different processes)?
1 reply
Alex Mestiashvili
@mestia
a Docker image can be converted to a Singularity image
5 replies
Raoul J.P. Bonnal
@helios
@pditommaso I am trying to create a new plugin. I am following nf-hello and I can compileGroovy, but when I run the launch I get a java.lang.NoClassDefFoundError for the dependency I need. I've added the dependencies to nf-my_plugin/plugins/nf-my_plugin/build.gradle as implementation ... and gradle downloaded them into the ~/.gradle cache directory. I am missing something for sure.
4 replies
arnaudbore
@arnaudbore
Hi Everyone,
I tried to put process.cleanup = true in the config file as suggested here https://github.com/nextflow-io/nextflow/pull/2135/files but it does not seem to work (using nextflow-21.09.0-edge-all). I definitely do not understand how to use this option. Can somebody help me with this one? Thank you in advance.
2 replies
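As a hedged aside: the long-documented option is a top-level cleanup flag, declared outside any scope block, which deletes the work directory files after a successful run; whether the experimental per-process variant discussed in that PR behaves differently is a separate question. A minimal nextflow.config sketch:

// nextflow.config
cleanup = true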
Arijit Panda
@arpanda

Hi Everyone,
I am new to nextflow, so can someone help me write nextflow code that processes multiple samples with multiple paired fastq files? Here is the link to my query that I posted on Stack Overflow: https://stackoverflow.com/questions/69702077/nextflow-how-to-process-multiple-samples

I can process a single sample at a time but not all of them. Here is my code for processing a single sample.

params.sampleName="sample1"
params.fastq_path = "data/${params.sampleName}/*{1,2}.fq.gz"

fastq_files = Channel.fromFilePairs(params.fastq_path)

params.ref = "ab.fa"
ref = file(params.ref)

process foo {
    input:
    set pairId, file(reads) from fastq_files

    output:

    file("${pairId}.bam") into bamFiles_ch

    script:
    """
    echo ${reads[0].toRealPath().getParent().baseName}
    bwa-mem2 mem -t 8 ${ref} ${reads[0].toRealPath()} ${reads[1].toRealPath()} | samtools sort -@8 -o ${pairId}.bam
    samtools index -@8 ${pairId}.bam
    """
}

process samToolsMerge {
    publishDir "./aligned_minimap/", mode: 'copy', overwrite: 'false'

    input:
    file bamFile from bamFiles_ch.collect()

    output:
    file("**")

    script:
    """
    samtools merge ${runString}.bam ${bamFile}
    samtools index -@ ${bamFile}
    """
}
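A minimal sketch of one way to handle several samples at once (the glob and variable names here are illustrative, not the poster's): match every sample with a single fromFilePairs glob and recover the sample name from the parent directory, so a later groupTuple() can drive one merge per sample.

params.fastq_path = "data/*/*{1,2}.fq.gz"

Channel
    .fromFilePairs(params.fastq_path)
    .map { pairId, reads -> tuple(reads[0].toRealPath().parent.baseName, pairId, reads) }
    .set { fastq_files }

// downstream, emit tuple(sampleName, bamFile) from the alignment process and
// call bamFiles_ch.groupTuple() so each sample gets a single merge task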
James Mathews
@jimmymathews

The channel.fromPath function seems to accept only bona fide path strings or glob patterns.
How can one make a channel from a path string which is itself only known by looking inside another file?

process retrieve_another_filename {
    input:
    path metadata_file

    output:
    stdout emit: other_filename

    script:
    """
    cat $metadata_file
    """
}

process show_file_contents {
    input:
    path other_filename

    output:
    stdout emit: contents

    script:
    """
    cat $other_filename
    """
}

workflow {
    metadata_file_ch = channel.fromPath('metadata.txt')
    retrieve_another_filename(metadata_file_ch)
    retrieve_another_filename.out.other_filename.view()

    // another_file_ch = channel.fromPath(retrieve_another_filename.out. ... ) ?

    show_file_contents(another_file_ch)
    show_file_contents.out.contents.view()
}

(Note: Trying to be DSL2-compliant.)

2 replies
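One approach that seems to fit here, sketched under the assumption that metadata.txt holds one path per line: rather than calling channel.fromPath on a process output, turn the emitted text into file objects with splitText() and file().

another_file_ch = retrieve_another_filename.out.other_filename
    .splitText()
    .map { line -> file(line.trim()) }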
KyleStiers
@KyleStiers
Has anyone seen nextflow pipelines (potentially more frequently a problem in ones that run a lot or on a cron job) throw errors with the JDK? I see hs_err_pidxxxx.log files show up in the directories where I run them on a slurm cluster describing segfault/sigbus errors. I found this mentioned in #842 and tried changing the flag and sometimes it works to resolve the issue (but potentially at the cost of ruining resuming which I would like to retain), but sometimes it just changes the nature of the error slightly. Any input on this would be great. I'll post the full error in a thread reply to this so it doesn't take up all the space on the page...[repost, but still haven't found any solution]
3 replies
Adam Price
@price0416

I am trying to do something fairly basic, but can't work out the groovy/nextflow-ish way to do it. Fairly new to nextflow/groovy.

I have two channels that look like this:

samples = ["sample1_a", "sample1_b", "sample2_a", "sample2_b", "sample3_a"]
pairs = [ ["sample1_a", "sample1_b"], ["sample2_a", "sample2_b"], ["sample3_a", "sample3_b"] ]

I want to select from samples the ones that have both matching pairs. So in pseudocode logic, I want to accomplish this:

validList = []
for (i in 0..<pairs.size()) {
    if (pairs[i][0] in samples && pairs[i][1] in samples) {
        validList.add(pairs[i][0])
        validList.add(pairs[i][1])
    }
}

I feel like there should be a pretty straightforward way to use .map or something to accomplish this. Any ideas?

1 reply
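If the two collections are plain lists, as in the pseudocode, ordinary Groovy collection methods already cover it; a minimal sketch mirroring the example above (if they are channels, the same test can sit inside a filter closure after collecting samples into a set):

def samples = ["sample1_a", "sample1_b", "sample2_a", "sample2_b", "sample3_a"]
def pairs   = [ ["sample1_a", "sample1_b"], ["sample2_a", "sample2_b"], ["sample3_a", "sample3_b"] ]

// keep only the pairs whose two members are both present, then flatten
def validList = pairs
    .findAll { pair -> pair[0] in samples && pair[1] in samples }
    .flatten()

assert validList == ["sample1_a", "sample1_b", "sample2_a", "sample2_b"]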
Somak Chowdhury
@somakchowdhury
nextflow runs the test profile using docker, but when running with samples specified in a csv file it says no input file specified. Can someone suggest what is going wrong?
Benjamin Wingfield
@nebfield

I want to run mawk -f hello_world.awk, and hello_world.awk lives in bin/. What's the nextflowy-est way to do this? (I know it seems weird)

I want to explicitly call mawk for portability (the binary can live in different places) and to use extra arguments like -v. Shebangs aren't portable. Env won't take multiple parameters unless I'm over or under thinking things. At the minute I'm doing some weird shell magic
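One option, sketched with invented process, input, and variable names: reference the script through the projectDir implicit variable, so mawk can be called explicitly and extra flags like -v are passed as usual.

process run_awk {
    input:
    path table

    output:
    stdout

    script:
    """
    mawk -v threshold=5 -f ${projectDir}/bin/hello_world.awk ${table}
    """
}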

W. Lee Pang, PhD
@wleepang
Is there a way to get a workflow's session hash before it is run?
hukai916
@hukai916
Hi folks, can anyone explain to me why the following snippet won't print the exit msg to the console? Basically, I have to comment out the TEST_MODULE() line, otherwise the "Error msg" won't be displayed in the console.
nextflow.enable.dsl = 2

workflow {
    process TEST_MODULE {
      script:
      """
      echo "test" > test.txt
      """
    }

    TEST_MODULE()
    exit 1, "Error msg"
}
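One thing that stands out, offered as a guess: in DSL2 a process definition belongs outside the workflow block, so a layout like the following (same content, just moved) is what the parser expects; whether that alone restores the error message is worth testing.

nextflow.enable.dsl = 2

process TEST_MODULE {
    script:
    """
    echo "test" > test.txt
    """
}

workflow {
    TEST_MODULE()
    exit 1, "Error msg"
}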
Greg Gavelis Code Portfolio
@ggavelis
Hi there, when troubleshooting caching issues, does anyone know how to parse the outputs of the -dump-hashes option?
(I'm trying to understand why only a minority of my outputs cache, even though they run successfully.)
The tips on the 'troubleshooting nextflow resume' page suggest running nextflow twice with the -dump-hashes option, then comparing the differences between the logfiles. But what are we looking for exactly? I do not see errors in either file.
First logfile: https://pastebin.com/MgPMWLxX
Second logfile: https://pastebin.com/XgBz3pHs
9 replies
emily-kawabata
@emily-kawabata
Hi everyone,
This is more of a general question, but does anyone have any tips or recommendations for when you want to create and maintain a directory structure from the start of the script to the end?
For example, I am writing a script that fetches sequence files from a database, runs clustering on them, creates a directory for each cluster containing its sequences, and finally annotates each file.
I don't know if it's ideal to input and output the entire base directory for every process in the script. Also, when I use the qualifier "each", the directory structure is lost and I would like to know how everyone is working around this.
2 replies