mmatthews06
@mmatthews06
Hey all, is there documentation on a recommended method of running nextflow with a debugger, like in IntelliJ, for development purposes, to set breakpoints and whatnot? I've only just started looking for that specifically, but I would've thought I'd have come across it by now.
Riccardo Giannico
@giannicorik_twitter
@happykhan Hi, I believe you are searching for this construct here:
:point_up: June 4, 2019 12:39 PM
Combiz
@combiz_k_twitter
According to the HPC/QMUL docs on NF: "Using the SGE executor for parallel jobs causes the master job to hang until it is killed by the scheduler for exceeding walltime. This is due to Apache Ignite not being able to communicate to other pipeline scripts submitted as separate jobs.". Is this generally true of using SGE? Or is it their particular HPC config that means NF can only be used for serial jobs with SGE?
Stephen Kelly
@stevekm

@happykhan

how do i scoop out the demultiplexed reads from bcl2fastq, organize it into read pairs

I do not do this inside Nextflow, I separate my demultiplexing pipeline from the rest of my analysis. I run a script on the demultiplexing output to coordinate the sample R1 R2 pairs into a new samplesheet as the input for the analysis pipeline.

Demultiplexing pipeline: https://github.com/NYU-Molecular-Pathology/demux-nf

downstream analysis pipeline: https://github.com/NYU-Molecular-Pathology/NGS580-nf

samplesheet generation (parsing of the R1 R2 pairs) happens here:
https://github.com/NYU-Molecular-Pathology/NGS580-nf/blob/4986e0a6a5eb9fec3e5016c8de29b60d5044df96/Makefile#L170

using this script:
https://github.com/NYU-Molecular-Pathology/NGS580-nf/blob/4986e0a6a5eb9fec3e5016c8de29b60d5044df96/generate-samplesheets.py

if you wanted to do it all inside one pipeline, then you might want to use some of the functions of that script somehow to do the SampleID-R1-R2 pairing and then output to Nextflow in a new channel. Or if you are good with regex you might be able to do it natively in a Nextflow channel .map or something like that.
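As a rough illustration of the SampleID-R1-R2 pairing step, here is a minimal shell sketch. It is not what generate-samplesheets.py does. It assumes bcl2fastq's usual `<sample>_S<n>_L<lane>_R1_001.fastq.gz` naming, and the function name `make_samplesheet`, the sample names, and the demo directory are all made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: pair demultiplexed R1/R2 FASTQ files into a tab-separated samplesheet.
# Assumes bcl2fastq-style names: <sample>_S<n>_L<lane>_R1_001.fastq.gz
# The function name, sample names and directory below are hypothetical.

make_samplesheet() {
    local fastq_dir="$1"
    local r1 r2 sample
    printf 'Sample\tR1\tR2\n'
    for r1 in "$fastq_dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        # derive the mate's name by swapping the R1 suffix for R2
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
        [ -e "$r2" ] || { echo "missing mate for $r1" >&2; continue; }
        # sample ID = file name without the _S<n>_L<lane>_R1_001.fastq.gz suffix
        sample="$(basename "$r1" | sed -E 's/_S[0-9]+_L[0-9]+_R1_001\.fastq\.gz$//')"
        printf '%s\t%s\t%s\n' "$sample" "$r1" "$r2"
    done
}

# Demo on empty placeholder files
demo_dir="$(mktemp -d)"
touch "$demo_dir/SampleA_S1_L001_R1_001.fastq.gz" "$demo_dir/SampleA_S1_L001_R2_001.fastq.gz"
touch "$demo_dir/SampleB_S2_L001_R1_001.fastq.gz" "$demo_dir/SampleB_S2_L001_R2_001.fastq.gz"
make_samplesheet "$demo_dir" > "$demo_dir/samplesheet.tsv"
cat "$demo_dir/samplesheet.tsv"
```

The resulting file could then be read by the analysis pipeline as its input samplesheet.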

@combiz_k_twitter I have used Nextflow without issue on SGE. I am not sure what exactly they are referring to with that quote. A lot of HPC admins get really hung up on the idea of using array-jobs all the time for everything and have trouble understanding that Nextflow is managing the job dependency itself and submitting all the jobs individually.
I am not sure what you mean by "NF can only be used for serial jobs with SGE"
Stephen Kelly
@stevekm
I have never used Apache Ignite but I am not clear what it has to do with anything in that situation. If Nextflow can communicate with SGE and submit jobs, what does Apache Ignite have to do with it? Also what are they referring to as the "master job"? If you are submitting the parent Nextflow process in its own SGE job then you should just set an adequate time limit on the job's execution. I typically give mine 5 days on our current SLURM system
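For the "parent Nextflow process in its own job" setup on SLURM, a job script might look roughly like the sketch below. The job name, memory request, and file location are hypothetical placeholders; only the 5-day limit comes from the message above. The block just writes the script to a file; submitting it with `sbatch` is left as a manual step.

```shell
#!/usr/bin/env bash
# Sketch: write a SLURM job script that runs the parent Nextflow process.
# Job name, memory and the script location are hypothetical placeholders.
job_script="${TMPDIR:-/tmp}/nf_head_job.sbatch"

cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=nf-head
#SBATCH --time=5-00:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

# The parent process only orchestrates; the tasks are submitted as separate jobs.
nextflow run main.nf -resume
EOF

echo "wrote $job_script"
# Submit manually with: sbatch "$job_script"
```

The time limit needs to cover the whole pipeline run, since the parent process has to stay alive until the last task finishes.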
banjosnapper
@banjosnapper

Hi there, I am having an issue with trying to implement a perl script within my nextflow workflow. Is it possible to call a script within nextflow? or do you have to write the script within the process?

I have tried both ways and have currently had no success.
I am trying to convert a 'stringtieMerged.gtf' gene_id which gives the default output to the mirBase names that I have.
The perl script works outside of nextflow but I then have issues when placing it into the workflow. I have provided the code below.

This is the process that creates the merged list

process createList {
        module 'stringtie'
        publishDir "$baseDir/../output/stringtieGTF", mode: 'copy'
        tag "${listGTF}"
        errorStrategy { task.exitStatus == 0 ? 'retry' : 'terminate' }
        maxRetries 3
        maxErrors -1

        input:
        file listGTF from listGTF.collect()
        file gff from gffFile

        output:
        file "mergeList.txt" into mergeList
        file "stringtieMerged.gtf" into stringtieMerged

        script:
        """
        touch mergeList.txt
        ls -1 $listGTF > mergeList.txt
        stringtie --merge -o stringtieMerged.gtf -G ${gff} mergeList.txt
        """
}

This is the perl script that works perfectly fine outside of the workflow but I get either exit status 25 or 2

process swapID {
        publishDir "$baseDir/../output/stringtieGTF", mode: 'copy'

        input:
        file "stringtieMerged.gtf" from stringtieMerged
        file gff from gffFile

        output:
        file "temp.gtf" into stringtieMergedID

        script:
        """
        #!/usr/bin/env perl

        my \$gff = "mmuChr.gff3";
        my \$gtf = "stringtieMerged.gtf";

        # open GFF3 from mirbase and made a lookup
        my %lookup;   # key value obejct

        open FPIN, "<".${gff} or die;   # open file for reading

        while (<FPIN>) {        # loop over each line in turn
                if (/ID\\=([^;]+);.*Name\\=([^;]+)[\\t \\r\\n\\f;]+/) {    # if line contain this string  + is atleast one match, * zero or some
                        my (\$id, \$name) = (${1}, ${2});
                        die if (exists \$lookup{\$id});         # don't really need this, but checks that the values isn't twice in the file
                        \$lookup{\$id} = \$name;
                }
        }

        close FPIN;

        # open GTF and create new temp file with substituted names
        open FPIN, "<".\$gtf or die;
        open FPOUT, ">temp.gtf" or die;     # output to a temp file

        while (my \$line = <FPIN>) {
                if (\$line =~ /; transcript_id \"(MI[^\"]+)\";/) {
                        my \$id = ${1};
                        die \$id unless (exists \$lookup{\$id});
                        my \$id2 = \$lookup{\$id};

                        # make substitution
                        \$line =~ s/gene_id \"[^\"]+\"/gene_id "\$id2"/;
                }
                print FPOUT \$line;
        }

        close FPIN;
        close FPOUT;
"""
}
Any help would be much appreciated
Alaa Badredine
@AlaaBadredine_twitter
@banjosnapper have you tried calling your script in nextflow? like
process swapID {
        publishDir "$baseDir/../output/stringtieGTF", mode: 'copy'

        input:
        file "stringtieMerged.gtf" from stringtieMerged
        file gff from gffFile

        output:
        file "temp.gtf" into stringtieMergedID

        script:
        """
        perl script.pl
        """
}
you can directly call your perl script within the script block of nextflow. So, you actually don't have to write perl code inside Nextflow, just give the path of your script, give it the right input and output, and it should work
banjosnapper
@banjosnapper
@AlaaBadredine_twitter I get an error saying that my script.pl does not exist. Should I be putting my script in a particular place to be able to call it? It is within the same directory as the main.nf
Alaa Badredine
@AlaaBadredine_twitter
what's your script name?
banjosnapper
@banjosnapper
I called it 'parseme.pl'
Alaa Badredine
@AlaaBadredine_twitter
replace script.pl by your actual script name
banjosnapper
@banjosnapper
That is what I did do
Alaa Badredine
@AlaaBadredine_twitter
ok so it would be perl /path/to/your/script/parseme.pl
you have to give the full path of your script
you can define it as a variable
banjosnapper
@banjosnapper
Okay I will try and give it the full path
Alaa Badredine
@AlaaBadredine_twitter
somewhere in nextflow
parseme = "/full/path/to/parseme.pl"
and then call it: perl $parseme
banjosnapper
@banjosnapper
Thank you that worked. However, I am still getting exit status 2 when executing
Caused by:
  Process `swapID` terminated with an error exit status (2)

Command executed:

  perl /scratch/c.c1860369/nextFlow/bin/parseme.pl

Command exit status:
  2

Command output:
  (empty)

Command error:
  Died at /scratch/c.c1860369/nextFlow/bin/parseme.pl line 10.
banjosnapper
@banjosnapper
Okay so I worked out the issue. It was because I did not give an absolute path to mmuChr.gff3 in the perl script. Many thanks for your help @AlaaBadredine_twitter
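A likely reason the bare file name failed: Nextflow runs each task in its own directory under work/, so a relative path inside the script is resolved there and not next to main.nf. The sketch below imitates that with two temporary directories standing in for the project and the task directory (both hypothetical). It also shows the alternative to a hard-coded absolute path: files declared under `input:` are staged into the task directory, by symlink by default.

```shell
#!/usr/bin/env bash
# Sketch: why a bare file name fails inside a task, and two ways around it.
# Both directories are stand-ins created only for this demo.
project_dir="$(mktemp -d)"   # stands in for the directory holding main.nf
work_dir="$(mktemp -d)"      # stands in for a task directory under work/
echo "##gff-version 3" > "$project_dir/mmuChr.gff3"

# 1. A bare file name is resolved against the task directory: not found.
relative_status="$(cd "$work_dir" && { [ -r mmuChr.gff3 ] && echo found || echo missing; })"

# 2. An absolute path works from anywhere (the fix reported above).
absolute_status="$(cd "$work_dir" && { [ -r "$project_dir/mmuChr.gff3" ] && echo found || echo missing; })"

# 3. Staging the file into the task directory, as happens for declared
#    inputs, makes the bare name work too.
ln -s "$project_dir/mmuChr.gff3" "$work_dir/mmuChr.gff3"
staged_status="$(cd "$work_dir" && { [ -r mmuChr.gff3 ] && echo found || echo missing; })"

echo "relative=$relative_status absolute=$absolute_status staged=$staged_status"
```

With a staged input, the script can take the file name as a command-line argument and avoid a fixed path altogether.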
Alaa Badredine
@AlaaBadredine_twitter
@banjosnapper no problem
mmatthews06
@mmatthews06
@pditommaso or anyone else, did anybody catch my question about attaching a debugger to Nextflow? Or running Nextflow in IntelliJ in debug mode, to set breakpoints, etc.? I'm starting back working on that, just thought I'd ask for any quick hints, since I assume somebody has already done it.
Paolo Di Tommaso
@pditommaso
it's possible for nextflow runtime development, but I guess you want it for nextflow scripts
mmatthews06
@mmatthews06
No, runtime development. I'm mucking around Nextflow internals for the time being.
Paolo Di Tommaso
@pditommaso
then it's straightforward, use ./launch.sh -remote-debug run .. etc
mmatthews06
@mmatthews06
Ah, alright, I'll try that. Thanks!
Paolo Di Tommaso
@pditommaso
Workflow components in the pipeline :sunglasses:
Stephen Kelly
@stevekm
@banjosnapper put your scripts in a directory called bin adjacent to the main nextflow script; example here: https://github.com/stevekm/nextflow-demos/tree/1238d0c444f388cb1ee79c351a57610e03e4bbb6/R-Python
as long as the scripts are executable then you can just invoke them directly from within your task
Guillaume Theaud
@GuillaumeTh
@pditommaso I saw that now in the .command.run the values for environment variables are written like MY_VAR=\"0\" whereas in 19.04 the variable was MY_VAR="0". Is this change expected?
Stephen Kelly
@stevekm

@banjosnapper instead of calling your script with perl myscript.pl you should instead put a shebang at the very first line that invokes the perl interpreter, see this Python script that includes one: https://github.com/stevekm/nextflow-demos/blob/1238d0c444f388cb1ee79c351a57610e03e4bbb6/R-Python/bin/test.py

this allows you to invoke the script as simply myscript.pl
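The bin directory and shebang advice can be tried outside Nextflow with the sketch below. Nextflow adds the project's bin directory to the task PATH; the demo imitates that by hand. The temporary project directory and the script name parseme.sh are hypothetical, and the demo script is written in bash only so that it runs anywhere. A Perl script would start with `#!/usr/bin/env perl` in the same position.

```shell
#!/usr/bin/env bash
# Sketch: an executable script with a shebang, found through PATH.
# The project directory and the script name are hypothetical.
project_dir="$(mktemp -d)"
mkdir -p "$project_dir/bin"

# The shebang on the first line selects the interpreter.
cat > "$project_dir/bin/parseme.sh" <<'EOF'
#!/usr/bin/env bash
echo "parsing $1"
EOF

# Without the executable bit the script cannot be invoked by name.
chmod +x "$project_dir/bin/parseme.sh"

# Imitate what Nextflow does for tasks: put bin/ on the PATH.
PATH="$project_dir/bin:$PATH"

# No interpreter prefix and no full path needed.
result="$(parseme.sh stringtieMerged.gtf)"
echo "$result"
```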

Paolo Di Tommaso
@pditommaso
@GuillaumeTh :point_right: nextflow-io/nextflow#1146
Guillaume Theaud
@GuillaumeTh
@pditommaso thanks
@pditommaso The problem in my case is that in the config file I set env {NB_PROCESSES=1}. But with the fix the value is NB_PROCESSES="1", and some scripts require an int, not a string. Do you have an idea how to set an int instead of a string in nextflow.config?
Paolo Di Tommaso
@pditommaso
Don't think env vars distinguish between num and string
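A quick shell check of that point: an environment variable's value is plain text, so a quoted and an unquoted assignment export the same thing, and it is up to the consumer to convert it. The sketch reuses the NB_PROCESSES name from the question; the other variable names are made up.

```shell
#!/usr/bin/env bash
# Sketch: environment variables carry no type; every value is a string.

# Quoted and unquoted assignments export exactly the same value.
export NB_PROCESSES="1"
from_quoted="$(printenv NB_PROCESSES)"
export NB_PROCESSES=1
from_unquoted="$(printenv NB_PROCESSES)"

# The consumer decides how to read the text; shell arithmetic
# interprets it as an integer here.
doubled=$(( NB_PROCESSES * 2 ))

echo "quoted=$from_quoted unquoted=$from_unquoted doubled=$doubled"
```

A script in another language has to do the same conversion on its side, for example by parsing the value as an integer after reading it from the environment.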
Stephen Kelly
@stevekm

sometimes when I restart a Nextflow pipeline with -resume, I get errors inside my processes such as java.lang.NullPointerException: Cannot get property 'outputDir' on null object. These come from processes that look like this:

process vcf_to_tsv {
    publishDir "${params.outputDir}/VEP/vcf_tsv", mode: 'copy'
    // stuff
}

so it seems like params is not always getting initialized correctly when the pipeline resumes.

I have the same error for dict objects that I initialize from reading in a JSON. Sometimes, the pipeline errors out when I try to access the object's keys inside a process. Problem is, all of these errors are completely random and not reproducible. Eventually if I restart the pipeline enough times with -resume, they go away. Or not, and I just start the pipeline from scratch... is this a known bug? I think we are still using 19.01

here is one such stack trace:
Dec-18 09:01:45.818 [PathVisitor-1] ERROR nextflow.Channel - Cannot get property 'outputDir' on null object
java.lang.NullPointerException: Cannot get property 'outputDir' on null object
        at org.codehaus.groovy.runtime.NullObject.getProperty(NullObject.java:60)
        at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:190)
        at org.codehaus.groovy.runtime.callsite.NullCallSite.getProperty(NullCallSite.java:46)
        at org.codehaus.groovy.runtime.callsite.GetEffectivePogoPropertySite.callGetProperty(GetEffectivePogoPropertySite.java:45)
        at _nf_script_121eedc3$_run_closure117$_closure442.doCall(_nf_script_121eedc3:2192)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:104)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:326)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
        at groovy.lang.Closure.call(Closure.java:411)
        at groovy.lang.Closure.call(Closure.java:405)
        at groovy.lang.GString.writeTo(GString.java:189)
        at groovy.lang.GString.toString(GString.java:153)
        at org.codehaus.groovy.runtime.typehandling.ShortTypeHandling.castToString(ShortTypeHandling.java:55)
        at nextflow.extension.Bolts.asType(Bolts.groovy:474)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.runtime.metaclass.ReflectionMetaMethod.invoke(ReflectionMetaMethod.java:54)
        at org.codehaus.groovy.runtime.metaclass.NewInstanceMetaMethod.invoke(NewInstanceMetaMethod.java:56)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:326)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1235)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
        ......
Paolo Di Tommaso
@pditommaso
it seems like an app issue, outputDir is not a NF variable
Stephen Kelly
@stevekm
It is defined inside the main.nf pipeline;
params.outputDir = "output"

process vcf_to_tsv {
      publishDir "${params.outputDir}/VEP/vcf_tsv", mode: 'copy'
....
....
}
this way the user is able to override it from the CLI
nextflow run main.nf --outputDir foo/
I thought this was standard practice?
Paolo Di Tommaso
@pditommaso
weird, try the latest version, if the problem persists open an issue
Stephen Kelly
@stevekm
ok
what was the arg to disable the ANSI logger in the latest?
Paolo Di Tommaso
@pditommaso
-ansi-log false