ziltonvasconcelos
@ziltonvasconcelos
version 21.10.6 build 5660
Paolo Di Tommaso
@pditommaso
I've had this same problem recently, however it's not clear what is happening
please move this discussion here
shenkers
@shenkers

I'm using DSL2, and I've been building a workflow where I want to allow the user to use different "flavors" of the workflow by specifying command line parameters. I started by using the if()/else() blocks described in the documentation, but I found that when I have multiple configurable steps this pattern explodes into nested conditional logic that becomes hard to read.

With a little fiddling I found that I could define a variable that holds a reference to a runnable step, and then use this variable in the workflow block as though it was a regular process or workflow. I haven't seen this described in the documentation, but it was a concise way of describing what I wanted. Is there a more conventional (syntactic sugar) "nextflow-way" of doing this?

to give a more concrete example, the pattern I was using was:

process variant1 {
...
}

process variant2 {
...
}

def variable_step = params.flag ? variant1.&run : variant2.&run

workflow {
  step1()
  variable_step()
  step3()
}

Do you think this is a "safe" thing to do? will it be stable-ish/compatible with future versions of nextflow?

shenkers
@shenkers

when I don't define variable_step such that it points to the run method, or don't bind call() to run(), e.g.:

variable_step.metaClass.call = variable_step.&run

then I get this error:

Missing process or function with name 'call'

when a process definition is "called" in a workflow context, is there something that does the call() -> run() binding at runtime? Why does assigning it to an intermediate variable change how the binding happens?

Paolo Di Tommaso
@pditommaso
hacking internal structures is not guaranteed to work. isn't an if/else enough to achieve the same?
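For reference, a minimal sketch of that if/else approach at workflow level, using the variant1/variant2/step1/step3 names from the snippet above (an illustration, not the only way):

workflow {
    step1()
    // pick the workflow "flavor" here instead of aliasing the process object
    if (params.flag) {
        variant1()
    } else {
        variant2()
    }
    step3()
}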
kaitlinchaung
@kaitlinchaung
Does anyone know what to do if a job finishes with exit code 0 but is still marked as failed? The .command.err and .command.out files are empty, and the job indeed finished successfully
Alex Mestiashvili
@mestia
How can I print a message if a channel is empty? I've tried something like files_ch.ifEmpty("empty").view(), but it prints the channel content when it is not empty. Also, is there a way to exit the workflow if a channel is empty?
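A minimal sketch of one way this is often handled, assuming files_ch as above: ifEmpty only evaluates its argument when the channel is empty, and if that argument is a closure calling the built-in error() function, the run is aborted with that message.

files_ch
    .ifEmpty { error "files_ch is empty, nothing to process" }   // only triggers when the channel is empty
    .view()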
Tim Dudgeon
@tdudgeon

I'm having a problem defining optional outputs in DSL2. I have a workflow where part of it is optional:

main:
process1()
if (params.run_extra) {
  process2()
}

That works fine. But now I want to emit the outputs. I've tried:

emit:
process1.out
if (params.run_extra) {
  process2.out
}

But that doesn't seem to be allowed.

Any ideas?

Paolo Di Tommaso
@pditommaso
you need a conditional expression
1 reply
emit:
some_ch = params.run_extra ? process2.out : some_default
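Spelled out a bit more as a sketch (Channel.empty() as the fallback is an assumption; any default channel works):

emit:
main_ch  = process1.out
extra_ch = params.run_extra ? process2.out : Channel.empty()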
Jong Ha Shin
@JjongX
JjongX.png
jerovign
@jerovign
Hello, I was wondering if there is a way to ask Nextflow to resubmit the run again and again, because sometimes the job crashes due to an incompatibility with the system, but it is a random error. Running the job again with the -resume option solves the problem.
3 replies
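A sketch of the usual answer to random failures (not necessarily what the thread replies suggest): let Nextflow retry the failing task itself via the errorStrategy and maxRetries directives, e.g. in nextflow.config:

process {
    errorStrategy = 'retry'   // resubmit a failed task instead of aborting the run
    maxRetries    = 3
}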
chbk
@chbk
#2546 was closed so I guess that functionality is not supported with DSL2 anymore. As an alternative I am attempting to execute Groovy code in a Closure provided to the subscribe method. Is it possible to catch an error that is thrown at that point? Basically something like this:
Channel.of(1, 2, 3).toList().subscribe({ list ->
  // Raise error
  throw new Exception("Something in the list is not valid")
})
// Would like to catch the error and stop execution here
println("Execution continues anyway") // This shouldn't be printed but it is
9d0cd7d2
@9d0cd7d2:matrix.org
[m]
Hi, I'm trying to bind one directory from the localhost to a Singularity container. I tried both containerOptions --volume /data/db:/db in the process section and also singularity.runOptions = '-B /data/db:/db', but I cannot mount it properly.
Do I need to modify something in the Singularity configuration to allow this?
singularity.autoMounts = true is defined in my profile section too
I'm trying something like this, just for testing:
echo true
containerOptions '--volume /tmp:/tmp/tmp-mounted'

script:
"""
cat /etc/*release >> /tmp/tmp-mounted/testing-bind.txt
echo "Hello world! From Singularity container" >> /tmp/tmp-mounted/testing-bind.txt
touch /tmp/tmp-mounted/thepipelinehasrun
ls /tmp/tmp-mounted/
cat /tmp/tmp-mounted/testing-bind.txt
"""

}
with a profile:
profiles {
    singularity {
        singularity.enabled = true
        singularity.autoMounts = true
        process.container = 'alpine.3.8.simg'
    }
}
1 reply
ChillyMomo
@ChillyMomo709

Hi all,

It seems I still get a 'configuration conflict' when I run with awsbatch, like the following bug: nextflow-io/nextflow#2370.

Configuration conflict
This value was submitted using containerOverrides.memory which has been deprecated and was not used as an override. Instead, the MEMORY value found in the job definition’s resourceRequirements key was used instead. More information about the deprecated key can be found in the AWS Batch API documentation.

Nextflow version

  Version: 21.10.6 build 5660
  Created: 21-12-2021 16:55 UTC 
  System: Linux 5.11.0-1022-aws
  Runtime: Groovy 3.0.9 on OpenJDK 64-Bit Server VM 1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
  Encoding: UTF-8 (UTF-8)

How to solve this?

ChillyMomo
@ChillyMomo709
Issue does not seem to happen with NXF_VER=21.04.1 nextflow run main.nf
Jeffrey Massung
@massung
Is there a process directive I can use to fail a workflow? Maybe I can just throw an exception, but I'm not sure if there's something nicer I should do instead? I basically have an if block in the directives section of a process, and if it's false I want to stop running and fail.
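As far as I know there is no dedicated "fail" directive; a hedged sketch of one alternative is to do the check at workflow level with the built-in error() function before the process is invoked (params.required_input and my_process are hypothetical names):

workflow {
    if (!params.required_input) {
        // aborts the run with a non-zero exit status and this message
        error "Missing --required_input, aborting the run"
    }
    my_process(params.required_input)
}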
Nathan Spix
@njspix

I have a workflow that involves splitting up files per chromosome and then merging them later. To make the workflow a bit more flexible, I first pull the chromosomes out of the reference file using a bit of grep, so I have a channel with all the chromosome names. I can then do something like this:

input:
tuple val(id), path(file) from channel_a
each chr from chromosomes

output:
tuple val(id), val(chr), path(outfile) into channel_b

then I can group things up:

channel_b.groupTuple(by: 0)

and use that as input for the next process.
My question is, since the number of chromosomes is constant for any given run of the workflow, can I extract that value (e.g. map{ it.readLines().size() }) and feed that into groupTuple? I thought perhaps I could assign this value to a variable and then pass that variable to the groupTuple call but this doesn't work (the type of the variable is something fancy, not an Int).

1 reply
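A hedged sketch using groupKey(), which lets groupTuple() learn the group size dynamically; chromosomes and channel_b are the channels from the snippet above, and count() turns the chromosome channel into a single value (in DSL1 you may first need to duplicate chromosomes with .into since it is also consumed by the process):

n_chrom = chromosomes.count()

channel_b
    .combine(n_chrom)                                                  // appends the count to every tuple
    .map { id, chr, outfile, n -> tuple(groupKey(id, n), chr, outfile) }
    .groupTuple()                                                      // emits each group once n items have arrived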
Shellfishgene
@Shellfishgene
The docs say that the merge operator will be removed soon. What's the replacement?
9d0cd7d2
@9d0cd7d2:matrix.org
[m]

Hi, this config worked for me :

singularity.enabled = true
singularity.runOptions = "--bind /path:/path"

many thanks for your suggestion @tomraulet, finally it is working for me too!

xmzhuo
@xmzhuo

Can I generate a val output emit from a script inside a process?
I tried to emit it as val:

process test {
    output:
    val val_var, emit: val_var

    shell:
    """
    val_var=test
    """
}

Error:
Caused by:
Missing value declared as output parameter: val_var

I also tried to emit as env

process test {
    output:
    env val_var, emit: val_var

    shell:
    """
    val_var=test
    """
}

When I tried to use it in the downstream process by calling test.out.val_var
Caused by:
No such property: val_var for class: ScriptC3863517AF925202A24F63BCD0003707

2 replies
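For reference, a minimal sketch of the env output qualifier pattern, which is the closest supported way to turn a value set inside the task script into a channel value (untested against the exact version in the chat):

process test {
    output:
    env val_var, emit: val_var   // captures the shell variable set in the script

    script:
    """
    val_var=test
    """
}

workflow {
    test()
    test.out.val_var.view()
}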
Stathis
@isthisthat

hello, I'm just starting out in nextflow and seeking some strategic advice. What I'm trying to achieve is to run a workflow for a list of input samples. I can set up a workflow for a single sample, but how do I push multiple samples through it? (Most examples in the docs show a single process.) Here's where I'm stuck:

params.samples = ["samples/a","samples/b","samples/c"]
process step1 {
    input:
    file sample from samples_ch
    output:
    file 'result.bam' into step1_ch
    ...
}
process step2 {
    input:
    file bam from step1_ch
    output:
    file 'result.vcf' into step2_ch
    ...
}

This runs for sample a but not the rest; I suspect it's because step2 only accepts one thing from step1_ch?
I can see two general strategies, either make a workflow for a single sample and then import that into a multi-sample wrapper, or enable each process to accept multiple inputs? Any advice would be greatly appreciated! Thanks

1 reply
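A hedged sketch of the usual setup, assuming the samples channel simply wasn't declared as a queue channel: each process invocation runs once per item it receives, so a channel with three files drives three runs of step1 and, via step1_ch, three runs of step2.

params.samples = ["samples/a", "samples/b", "samples/c"]

// one item per sample; step1 and step2 each run once per item
samples_ch = Channel.fromPath(params.samples)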
Håkon Kaspersen
@hkaspersen
Hello everyone, I am converting my pipeline to use Singularity images. I use images from biocontainers. I am trying to make my pipeline portable, and for now I have created a script that downloads these images to a user-specified directory to run from. However, for some containers, such as R with specific packages, I could not find another solution than to build the image myself using a .def file with singularity build. This proved to work, but I am unsure how to make this portable. What are best practices regarding images, and how should I do this to optimize functionality and user-friendliness?
9d0cd7d2
@9d0cd7d2:matrix.org
[m]
Hi, I tried to join the nf-tower Gitter channel, does somebody know if it's only for Enterprise users? Thanks in advance
maxulysse
@maxulysse:matrix.org
[m]
It's for everyone
9d0cd7d2
@9d0cd7d2:matrix.org
[m]
I cannot join the channel from the URL, maybe somebody from the team needs to add me
Laurent Modolo
@l-modolo:matrix.org
[m]
Hi, I am trying unsuccessfully to implement the feedback loop pattern in DSL2. Is it possible to implement?
4 replies
ebioman
@ebioman

If I see that correctly, the error strategy "ignore" will lead to workflow.success = true at the end.
When running many samples I would indeed like to ignore a single failing one, but then check at the very end whether any of them failed.
Is there a trace/object which can be accessed in main.nf where one could verify that at the end? Something like

Sample success
1 true
1 true
1 true
1 true
1 false
1 true
1 true

This would allow cleaning up e.g. published data which would otherwise be orphan files

ebioman
@ebioman
Sorry, can't edit, but obviously it should be different samples, doh
tkwitsil
@tkwitsil

I am having an issue where an imported module has an implicit workflow.onComplete handler. When I run the main workflow, the imported workflow.onComplete handler is being triggered, I assume because wf2's "workflow" is in the wf1 namespace. Example code:

// wf1.nf
include { subworkflow } from "./wf2"

// wf1 implicit workflow
workflow {
    main:
        println('wf1 implicit workflow called')
}

// pulls wf2 implicit workflow.onComplete into this namespace and executes 

----------------------------------------
// wf2.nf
//explicitly named workflow that is imported to wf1
workflow subworkflow {
    main:
        println('wf2 as subworkflow called')
}

// wf2 implicit workflow
workflow {
    main:
        println('wf2 implicit workflow called')
}

// wf2 implicit workflow.onComplete handler
workflow.onComplete {
    log.info('wf2 implicit workflow completed')
}

Command and output is:

$ nextflow run wf1.nf 
N E X T F L O W  ~  version 21.10.2
Launching `wf1.nf` [awesome_euclid] - revision: e050c16fcb
wf1 implicit workflow called
wf2 implicit workflow completed

Is there a way to avoid this namespace clash while keeping the workflow.onComplete handler for wf2? Or do I need to pull out the subworkflow in the example above to its own separate file and have wf1 import directly from that?

pouya ahmadvand
@pouya1991
Hi, I am trying to use Nextflow to run Singularity containers for my experiment pipeline. I am using slurm as the executor. The problem I am facing is that singularity has not been added to the node's PATH, so Nextflow is not able to find singularity and gives this error:
env: ‘singularity’: No such file or directory
I submit my jobs through the head node which is different from the execution node.
I am wondering, is there any way to append a custom path for the singularity command lookup?
Thanks
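A hedged config sketch using the beforeScript directive to extend PATH on the compute nodes (the /opt/singularity/bin path and the module name are placeholders for whatever the cluster actually provides):

process {
    // prepend the install location to PATH before each task runs...
    beforeScript = 'export PATH=/opt/singularity/bin:$PATH'
    // ...or, on module-based clusters, something like:
    // beforeScript = 'module load singularity'
}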
Peter Evans
@peterkevans
Hi all, first time poster... Sorry if this question has been asked but I'm struggling to find an answer.
I have a process that creates a bunch of fastq files and need to create a channel from them in order to scatter the next process. I believe I need something like the following output statement, but I'm having trouble working out how to set sampleId to *, i.e. how to get the sampleId from the file name. Any help would be greatly appreciated. (I'm using DSL2)
  output: 
  tuple val(sampleId), path("fastq_files/*_R{1,2}_001.fastq.gz"), emit: fastq
3 replies
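One hedged approach (a sketch; demux is a hypothetical name for the producing process, its output declared as just the path glob, and the sample ID assumed to be everything before _R1/_R2 in the file name) is to emit the files without the ID and reconstruct it afterwards with map:

demux.out.fastq
    .flatten()
    .map { fq -> tuple(fq.name.replaceAll(/_R[12]_001\.fastq\.gz$/, ''), fq) }
    .groupTuple(size: 2)          // re-pair R1/R2 per sample
    .set { per_sample_fastq }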
awgymer
@awgymer

I have a situation where I need some dynamic input values for a process which I will need to fetch from a datastore as part of the pipeline. I was wondering what the best/accepted way of getting these values available to the process as variables is? My initial thought is to have the script that grabs the values from the datastore output a JSON file and then use a JSON reader in the process that requires them to access them?

Something like:

proc1 {
     output: 
        path patient_data.json
     script:
     """
     python get_patient_data.py
     """
} 

proc2 {
     input:
         path patient_data_file
         path other_file 
    output:
         path some_output.file
    script:
    patient_data = jsonSlurper.parse(patient_data_file)
    """
    the_command --opt1 ${patient_data['val1']} --opt2 ${patient_data['val2']} other_file
    """
}

Is this a reasonable solution? (I am aware the actual code above won't work because I haven't properly created the jsonslurper)
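It looks reasonable; a runnable sketch of the second process, parsing the staged JSON in the script block with Groovy's JsonSlurper (the val1/val2 field names are the hypothetical ones from the example):

process proc2 {
    input:
    path patient_data_file
    path other_file

    output:
    path 'some_output.file'

    script:
    // parse the small JSON file when the task script is composed
    def patient_data = new groovy.json.JsonSlurper().parseText(patient_data_file.text)
    """
    the_command --opt1 ${patient_data['val1']} --opt2 ${patient_data['val2']} ${other_file} > some_output.file
    """
}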

Pablo
@pablo-esteban:matrix.org
[m]

Hi all, I am getting a java.nio.file.ProviderMismatchException when I run the following script:

process a {
    output:
        file _biosample_id optional true into biosample_id

    script:
    """
    touch _biosample_id
    """
}

process b {
    input:
        file _biosample_id from biosample_id.ifEmpty{file("_biosample_id")}

    script:
    def biosample_id_option = _biosample_id.isEmpty() ? '' : "--biosample_id \$(cat _biosample_id)"
    """
    echo \$(cat ${_biosample_id})
    """
}

I'm using a slightly modified version of the Optional Input pattern.

Any ideas on why I'm getting the java.nio.file.ProviderMismatchException?

Tim Dudgeon
@tdudgeon
Is it possible to declare that task2 must wait for task1 to complete before it starts, in the case where task1 does not create an output that can be fed to task2? In my case task1 writes to a directory and task2 reads data from that directory that task1 has created, but there is no specific output of task1. I can probably fabricate an output, but that sounds messy. To exemplify (with DSL2):
workflow mywf {

    take:
    data_dir

    main:
    task1(data_dir)
    task2(data_dir) // should wait for task1 to complete before starting
}
1 reply
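A hedged sketch of the "state dependency" idea: give task1 a token output (val true) and make task2 take it as an extra input it never uses, which is enough to force the ordering:

process task1 {
    input:
    path data_dir

    output:
    val true, emit: done          // token value, only used for ordering

    script:
    """
    # ... writes into data_dir ...
    """
}

workflow mywf {
    take:
    data_dir

    main:
    task1(data_dir)
    task2(data_dir, task1.out.done)   // task2 declares a second `val` input it ignores
}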
Moritz E. Beber
@Midnighter
Hi, I was wondering if any of the groovy specialists have a good solution for the following: I have in a channel a tuple consisting of a hash map, a FastA file, and a CSV file. I would like to transform this in such a way that I get the hash map and FastA file plus a value from the CSV for each row in the CSV file. Thank you for any pointers.
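A hedged sketch with flatMap, reading the CSV with plain Groovy (ch_input is a placeholder for the channel of [map, fasta, csv] tuples, and it assumes a header row with the wanted value in the first column):

ch_input.flatMap { meta, fasta, csv ->
    csv.readLines()
       .drop(1)                                   // skip the header row
       .collect { line ->
           def value = line.tokenize(',')[0]
           tuple(meta, fasta, value)              // one emission per CSV row
       }
}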
Alaa Badredine
@AlaaBadredine_twitter
Hello, I would like some assistance with Nextflow operators please. I have many samples and each one has been fragmented and processed by chromosome in the pipeline. At the end, I would like to collect all the chromosomes belonging to the same sample. Using .collect() ends up generating more than 9k symlink files in the same folder, for each sample. Is there any way to collect them separately?
1 reply
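A sketch of grouping per sample with groupTuple instead of a global collect (per_chrom_ch is a placeholder for a channel of [sample_id, chrom, chrom_file] tuples):

per_chrom_ch
    .map { sample_id, chrom, chrom_file -> tuple(sample_id, chrom_file) }
    .groupTuple()                 // one emission per sample, with the list of its chromosome files
    .set { per_sample_ch }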
Kale Kundert
@kalekundert
Is it possible to use the DSL2 pipe operator with processes that have more than 1 input/output? For example, here's a snippet where I want to run process b 3 times on the output of a:
nextflow.enable.dsl = 2

process a {
    input:
        val x
    output:
        val y
    exec:
        y = x.toUpperCase()
}

process b {
    input:
        val x
        val n
    output:
        val y
    exec:
        y = "$x$n"
}

workflow {
    x = channel.value('a')
    n = channel.of(1..3)

    // I know these lines would work.
    //p = a(x)
    //b(p, n) | collect | view

    // Is there any way to do it all in one pipeline?
    a(x) | b(???, n) | collect | view
}
Richard Corbett
@RichardCorbett

I want to get a channel of the form:
['lib1', 'species1']
['lib1', 'species2']
['lib2', 'species3']

My process is parsing a kraken2 report text file to find any species present above some threshold per lib:

process select_species {
    input:
    tuple val(library_id), path(kraken_report)
    val(threshold)

    output:
    tuple val("${library_id}"), stdout, emit: species_list

    script:
    """
    awk '\$1>${threshold} && \$4=="S" { print \$NF }' ${kraken_report} | grep -v sapiens
    """
}

This gives me a tuple that contains newlines, but I feel like I'm only 1 magic nextflow command away from getting my desired output.
Current output:

[P01900, coli
ananatis
oryzae
acnes
barophilus
VB_PmiS-Isfahan
]

Desired output:

[P01900, coli]
[P01900, ananatis]
[P01900,barophilus]
...
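Probably flatMap: split the stdout on newlines and emit one tuple per species (a sketch against the species_list output above):

select_species.out.species_list
    .flatMap { library_id, species_text ->
        species_text.trim()
                    .split('\n')
                    .collect { species -> tuple(library_id, species.trim()) }
    }
    .view()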
Bioninbo
@Bioninbo
Hi all. I put all my scripts in the bin folder, but when I call an R script from R it cannot find them, i.e. source('my_script.R') gives me "No such file or directory". For bash scripts it works though: I can run "my_script.sh" without specifying the path. But for perl scripts I also need to give the path, i.e. perl "${projectDir}/bin/my_script.pl".
1 reply
maxulysse
@maxulysse:matrix.org
[m]
I usually do just my_script.r
Bioninbo
@Bioninbo
I make my call from within R, to import functions.