hukai916
@hukai916
Hello all,
Does anyone know if it is possible to mix more than one language in the script block? E.g. run a specific bash command first, then run the rest with Python? Thanks!
2 replies
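One common pattern for this, sketched here under the assumption that bash stays the main script language (process name and commands are illustrative): keep the script block in bash and invoke the Python interpreter inline for the remaining steps.
process mixed_languages {
    script:
    """
    # bash runs first
    echo "preparing inputs with bash" > prep.log

    # then hand the rest of the work to Python
    python3 -c 'print("continuing in Python")'
    """
}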
Pavel Borobov
@blvp

Hi all, how much does it cost to run a workflow using Tower and AWS Batch + FSx for Lustre? I'm working on a customer project right now and want to evaluate this option. Perhaps some guidance on the documentation and pricing model is needed.
Using AWS Batch mode with an S3-only work directory is not very efficient.
What is the commercial license price?
Steffen Fehrmann
@sfehrmann
:point_up: August 11, 2019 12:50 PM
Hi all, I'm looking for a method to convert paired-end data from bcl2fastq process output into fastq file-pair tuples. @happykhan's method is the only one I've found. Is there a more elegant way with DSL2? Currently I have a standard bcl2fastq process that emits *.fastq.gz, and I can read in the sample sheet, but I'd need to match sample sheet names and file names. Is there some way to use something like the .fromFilePairs factory on an existing channel?
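A sketch of one possible DSL2 approach (not taken from the thread; the file-naming convention and regex are assumptions): derive a sample key from each emitted fastq and group the files into pairs on the existing channel.
// assumes bcl2fastq_out emits individual *.fastq.gz paths
bcl2fastq_out
    .map { fq ->
        // e.g. sampleA_S1_R1_001.fastq.gz -> sampleA_S1
        def sample = fq.name.replaceFirst(/_R[12].*$/, '')
        tuple(sample, fq)
    }
    .groupTuple(size: 2)
    .set { read_pairs_ch }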
Ernesto Lowy
@elowy01
Hi, I've a question on a process using a Python block and emitting files in DSL2.
This is the workflow:
nextflow.enable.dsl=2
process createFiles {
    output:
    path("*.txt", emit: apath)

    script:
    """
    #!/usr/bin/env python

    filenames = ['a.txt', 'b.txt', 'c.txt']

    for f in filenames:
        with open(f, "w") as wf:
            wf.write("hello\\n")
        print(f)
    """
}
process printContent {
    input:
        path(x)

    script:
    """
    cat $x
    """
}

workflow {
    createFiles()
    printContent(createFiles.out.apath)
}
The first process (createFiles) creates the files using a Python block and emits a channel with the paths of the created files, and the second (printContent) prints the contents of each file.
My question is:
When I run this workflow, I see the following information from NF:
executor >  local (2)
[67/0c496d] process > createFiles  [100%] 1 of 1 ✔
[1b/36a13c] process > printContent [100%] 1 of 1
Ernesto Lowy
@elowy01
And it seems that the three files generated by createFiles are emitted in a single channel and all files are taken together by printContent, instead of each being analysed in an independent printContent job.
7 replies
Can you let me know how to analyse each of the files independently?
Thanks
Luca Cozzuto
@lucacozzuto
Hi @elowy01 :) you can either build the file names into a channel outside your block of code, as suggested by @pcantalupo, or use the flatten operator
Paul Cantalupo
@pcantalupo
@lucacozzuto where do you add flatten? Can you post the code? thank you
Luca Cozzuto
@lucacozzuto
here
workflow {
    createFiles()
    printContent(createFiles.out.apath.flatten())
}
Paul Cantalupo
@pcantalupo
ahh, I tried that but forgot the (). Thank you
4 replies
Luca Cozzuto
@lucacozzuto
Hi all, do you know how to change the behaviour when uploading the bin folder to AWS?
I found that soft links are not preserved and are lost.
I opened an issue, but I'm not sure about it: nextflow-io/nextflow#2427
Luca Cozzuto
@lucacozzuto
Well it looks like links are not possible in S3. Mmm
Luca Cozzuto
@lucacozzuto
Well, another question: is there any variable that Nextflow sets when uploading the bin folder to S3, for accessing that folder from any process?
cc @pditommaso :)
John Ma
@JohnMCMa

Is it possible to create files with the native execution mode of a process? For example, I attempted the following:

process WRITE_FASTP_METRICS{
    input:
        val (rna_result)
        val (adt_result)
    output:
        path "fastp_metrics.csv"
    exec:
        write_out = file("fastp_metrics.csv")
        rna_result.forEach{key, value ->
            write_out << key << ',' << value << '\n'
        }
        adt_result.forEach{key, value ->
            write_out << key << ',' << value << '\n'
        }
}

But fastp_metrics.csv is not created in the work directory, causing this error: Missing output file(s) `fastp_metrics.csv` expected by process `WRITE_FASTP_METRICS (1)`
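One commonly suggested fix, sketched here on the assumption that the problem is the file being created outside the task directory: resolve the output path against task.workDir inside the exec block.
process WRITE_FASTP_METRICS {
    input:
        val(rna_result)
        val(adt_result)
    output:
        path "fastp_metrics.csv"
    exec:
        // write into the task work directory so the declared output can be collected
        def write_out = task.workDir.resolve('fastp_metrics.csv')
        rna_result.each { key, value ->
            write_out << "${key},${value}\n"
        }
        adt_result.each { key, value ->
            write_out << "${key},${value}\n"
        }
}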

anoronh4
@anoronh4

i'm wondering if we can pass in a container as a variable, as i want to test the same process across various versions of a piece of software. something like this:

process A {
    container = container_label

    input:
    tuple val(container_label), path(inputFile)
    ...
}

this code did not work, however. can it be done in another way?

rpetit3
@rpetit3:matrix.org
[m]
You could make a parameter to do it at run time, something like
process A {
    container = params.container_label

    input:
    path(inputFile)
    ...
}
anoronh4
@anoronh4
isn't that still the same issue? params.container_label is just one value, i still want the input channels to affect the container directive
1 reply
i think i got it:
process A {
    input:
    tuple val(container_label), path(inputFile)
    ...
    script:
    task.container = container_label
    ...
}
rpetit3
@rpetit3:matrix.org
[m]
let us know if that works!
anoronh4
@anoronh4
@rpetit3:matrix.org it does!
rpetit3
@rpetit3:matrix.org
[m]
nice to know! thanks for sharing
Luca Cozzuto
@lucacozzuto
Mmm, why not pass it as a parameter? I'm passing a number of things in my workflows:
emily-kawabata
@emily-kawabata
Hi everyone,
Does anyone know if there will be any Nextflow workshop in the near future? I see that there was one in July 2020 and another in May of this year hosted by ecseq, and I was wondering if a similar event will take place in the future.
2 replies
9d0cd7d2
@9d0cd7d2:matrix.org
[m]
Hi all! I'm very interested in the tool, as it seems to cover a lot of the integrations that we need for a particular project (Slurm, buckets, Singularity, etc.), but my worry is that our project is mostly about CFD workflows and a small part about AI, and apparently Nextflow seems quite related to bio and genomics workflows. Do you think that we can use it anyway?
5 replies
xmzhuo
@xmzhuo
Hey All,
For azurebatch, is it possible to define two pool types (with autoScale for different vmTypes) in autoPoolMode?
Ghost
@ghost~61847cca6da037398489d4e6
In watchPath, is it OK to use a wildcard in the directory and in the file name at the same time? For example: watchPath('/myfolder/*/logs/*.log', 'create'). It doesn't seem to work for me.
zhemingfan
@zhemingfan
Hi everyone, I'm relatively new to Nextflow. For the following code, I'm getting an error where I'm unable to retrieve the index file ([E::idx_find_and_load] Could not retrieve index file for 'merged_sorted.vcf.gz'), even though the folder points to the correct path and running this command normally outside of Nextflow works fine. Would anyone happen to know how to fix this?
process generate_readset {
    tag "$sample_id"
    cpus 48

    input:
    tuple val(read_name), val(chromosome1), val(chromosome2), val(cuteSV_pos1), val(cuteSV_pos2),
        val(sniffle_pos1), val(sniffle_pos2),
        path(cuteSV_vcf), path(sniffles_vcf) from vcf_input

    output:
    path 'complete_read_set.txt' into receiver

    script:
    """
    ${bcftools_1_11} view --threads ${task.cpus} $cuteSV_vcf -r chr$chromosome1:$cuteSV_pos1-$cuteSV_pos2 > complete.txt
    """
}
Ghost
@ghost~61847cca6da037398489d4e6
@zhemingfan You need to stage the vcf index file. Adding the index file as an input would solve your problem. In addition, your output file name has to be complete_read_set.txt.
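A sketch of what that staging could look like, keeping the original DSL1 style (the *_idx names are hypothetical, and the upstream vcf_input channel would need to carry the extra paths):
process generate_readset {
    cpus 48

    input:
    tuple val(read_name), val(chromosome1), val(chromosome2), val(cuteSV_pos1), val(cuteSV_pos2),
        val(sniffle_pos1), val(sniffle_pos2),
        path(cuteSV_vcf), path(cuteSV_vcf_idx),
        path(sniffles_vcf), path(sniffles_vcf_idx) from vcf_input

    output:
    path 'complete_read_set.txt' into receiver

    script:
    """
    ${bcftools_1_11} view --threads ${task.cpus} $cuteSV_vcf -r chr$chromosome1:$cuteSV_pos1-$cuteSV_pos2 > complete_read_set.txt
    """
}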
Paul Cantalupo
@pcantalupo
I'm having problems pulling a github repo with nextflow. The repo is private and part of an Organization of which I am an Owner. I created an SCM file with my personal username and am able to pull personal private repos with nextflow pull. But when I try to pull a private Organizational repo, I get the following: Remote resource not found: https://api.github.com/repos/PATH/TO/contents/main.nf. What am I doing wrong?
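For reference, a minimal sketch of the kind of SCM file being described (values are placeholders; whether it resolves the organization case likely depends on the access token having permission on the organization's repositories):
// ~/.nextflow/scm
providers {
    github {
        user = 'my-github-username'            // placeholder
        password = 'my-personal-access-token'  // placeholder; must be authorised for the organization
    }
}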
xmzhuo
@xmzhuo

Hey All,
I have an error related to Nextflow azurebatch. The first process, using the default D4_v3 VM, works all right, but for the second process I fail to request a larger VM (I set it via queue, but apparently it is not being applied; did I make some naive mistake?)
'''

Error executing process > 'secondprocess'

Caused by:
Cannot find a VM for task 'secondprocess' matching this requirements: type=Standard_D4_v3, cpus=16, mem=14 GB, location=eastus
'''

The config file I used:

process {
       executor = 'azurebatch'
}

docker {
    enabled = true
}

azure {
  batch {
    location = 'eastus'
    accountName = 'xxxbatch'
    accountKey = 'xxx'
    autoPoolMode = true
    allowPoolCreation = true 
    deletePoolsOnCompletion = true 
    deleteJobsOnCompletion = true 
    pools {
        small { 
            autoScale = true
            vmType = 'Standard_D4_v3'
            vmCount = 5
            maxVmCount = 50
        }    
        large { 
            autoScale = true
            vmType = 'Standard_D16_v3'
            vmCount = 5
            maxVmCount = 50
        }    
    }
  }
  storage {
    accountName = "xxx"
    accountKey = "xxx"
  }
}

process {
    withName: firstprocess {
        queue = 'small'
    }
    withName: secondprocess {
        queue = 'large'
    }
}
Greg Gavelis Code Portfolio
@ggavelis
Hi all, simple question: I'm trying to create a channel from a list of tuples, val(sample_id) and path(input_file). Though I've checked that the input file exists, I am encountering the error 'path value cannot be null'. Does anyone know what is wrong with my syntax? https://pastebin.com/YbSG28a3. BTW: are tuples even the best approach for this? Since the sample_id string is embedded in the path to the input_file, shouldn't there be a way to extract the sample_id from the file path?
3 replies
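For context, a minimal sketch of two ways such a channel is often built (sample names and paths are illustrative); the 'path value cannot be null' error usually indicates a null element where a path was expected:
// explicit list of [sample_id, file] tuples
Channel
    .fromList([
        ['sampleA', file('/data/sampleA.fastq.gz')],
        ['sampleB', file('/data/sampleB.fastq.gz')]
    ])
    .set { samples_ch }

// or derive the sample_id from the file path instead of listing it
Channel
    .fromPath('/data/*.fastq.gz')
    .map { f -> tuple(f.simpleName, f) }
    .set { derived_samples_ch }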
Hugues Fontenelle
@huguesfontenelle
Hello @pditommaso and friends :-)
I'm wondering, is there any reason why the cpus directive is not used with the local executor (via the --cpus Docker param)? After all, the Docker param --memory is used.
10 replies
Matteo Schiavinato
@MatteoSchiavinato
I have a question that perhaps isn't very complicated to answer: is it possible to pass Nextflow a dataframe of input files / information to read (e.g. one sample per line), instead of passing paths in multiple channels and then combining them into a single tuple? I was surprised not to find much about it online, so I wondered whether the option exists.
4 replies
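A common pattern for this is a samplesheet read with splitCsv; a minimal sketch, assuming a CSV with sample_id and fastq columns (both names are illustrative):
Channel
    .fromPath('samples.csv')
    .splitCsv(header: true)
    .map { row -> tuple(row.sample_id, file(row.fastq)) }
    .set { samples_ch }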
Bede Constantinides
@bede
Hi all, I'm using a shared system where homedirs have quotas, and keep getting disk quota errors with nextflow run, despite specifying an unquota'd path with -w aka -work-dir. Any ideas? In this thread, Paolo suggests -w is the solution… https://groups.google.com/g/nextflow/c/401Tp_6H57k/m/va8ACNeTAQAJ
2 replies
$ nextflow run nf-core/viralrecon -w /users/xxx/test --help
N E X T F L O W  ~  version 21.04.0
Pulling nf-core/viralrecon ...
Disk quota exceeded
Bede Constantinides
@bede
Solution: set NXF_ASSETS to a path without a quota
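For reference, a sketch of that workaround (the assets path is illustrative):
$ export NXF_ASSETS=/scratch/xxx/nextflow-assets
$ nextflow run nf-core/viralrecon -w /users/xxx/test --help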
Ignacio Eguinoa
@ieguinoa
Hi all, I was wondering if there is any resource to help parse Nextflow files into a tree of syntax elements (AST), ideally using Python. I'm working on a small tool that tries to extract some metadata from the processes, channels, etc., and I need to parse the workflow definition into objects.
It's similar to what the javalang package does in Python to parse generic Java code (https://pypi.org/project/javalang/).