These are chat archives for nextflow-io/nextflow

24th
Apr 2018
Ashley S Doane
@DoaneAS
Apr 24 2018 06:35
Hi, following up on the use of beforeScript: it works on a local machine, but with SGE I get errors for commands that should be in my PATH: /home/asd2007/setup-env.sh: line 133: module: command not found
That is, I can ssh to the compute node and run the commands successfully. What is the environment the beforeScript is run in? Any ideas on how to get this working? thanks ;)
Paolo Di Tommaso
@pditommaso
Apr 24 2018 06:36
what's your beforeScript command ?
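
A common cause of "module: command not found" under a batch scheduler is that `module` is a shell function defined by the site's login profile, which a non-interactive job shell may not load. A minimal nextflow.config sketch, assuming that is the cause here (it is not confirmed in this thread) and that the cluster's init script lives at /etc/profile.d/modules.sh (a hypothetical path that varies by site):

```groovy
// nextflow.config (sketch; the modules init path is an assumption, check your cluster)
process {
    executor = 'sge'
    // Define the `module` shell function first, then run the user's own setup script.
    beforeScript = 'source /etc/profile.d/modules.sh; source /home/asd2007/setup-env.sh'
}
```
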
Pierre Lindenbaum
@lindenb
Apr 24 2018 09:50

Hi all, is it possible to define a Groovy/Java function in my *.nf file and then use it in a script section? I'm thinking of something like:

String picard(String name) { return "java -jar /path/to/picard.jar "+name);

(...)
script:
"""
${nextflow.picard("SortSam")} I=${ibam} O=${obam}
""""

thanks

Paolo Di Tommaso
@pditommaso
Apr 24 2018 09:51
exactly like that, except some syntax typos
String picard(String name) { "java -jar /path/to/picard.jar $name" } 

(...)
script:
"""
${picard("SortSam")} I=${ibam} O=${obam}
"""
long answer
you can define any helper function in your pipeline script
then you can invoke it in the script without any special prefix
alternatively you can create a helper Groovy (or even Java) class with a set of static methods
and then invoke them using the standard package.Class.method Java syntax
classes saved in the lib/ folder are compiled and added to the classpath on the fly
/end
:)
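
A minimal sketch of the lib/ approach described above. The file name lib/PipelineUtils.groovy and the class name PipelineUtils are hypothetical, chosen only for illustration; the jar path is the placeholder from the question.

```groovy
// lib/PipelineUtils.groovy (hypothetical file and class name)
class PipelineUtils {
    // Build the command prefix for a given Picard tool name.
    static String picard(String name) {
        "java -jar /path/to/picard.jar ${name}"
    }
}
```

Inside a script block this would be invoked as ${PipelineUtils.picard("SortSam")} I=${ibam} O=${obam}; a class declared in a package would be referenced by its fully qualified name instead.
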
Luca Cozzuto
@lucacozzuto
Apr 24 2018 09:55
Hi @lindenb, I'm trying to define / group some functions like these here
have a look and if you like feel free to add
(still quite empty... :) )
Pierre Lindenbaum
@lindenb
Apr 24 2018 10:00
@pditommaso very cool thanks !
Paolo Di Tommaso
@pditommaso
Apr 24 2018 10:01
enjoy
Venkat Malladi
@vsmalladi
Apr 24 2018 13:49
@pditommaso not sure yet
Will know more once i have access to azure
Paolo Di Tommaso
@pditommaso
Apr 24 2018 13:50
same here :)
Venkat Malladi
@vsmalladi
Apr 24 2018 13:55
@lucacozzuto We are trying to do the same to allow for a central pipeline code repo to build multiple pipelines quickly
Luca Cozzuto
@lucacozzuto
Apr 24 2018 14:14
@vsmalladi nice! Can we "join" the efforts in some way?
Paolo Di Tommaso
@pditommaso
Apr 24 2018 14:15
both of you should join https://nf-core.github.io !
Alexander Peltzer
@apeltzer
Apr 24 2018 14:15
Yes please!
Paolo Di Tommaso
@pditommaso
Apr 24 2018 14:15
Luca is shy.. :joy:
Luca Cozzuto
@lucacozzuto
Apr 24 2018 14:21
:) how to join?
Paolo Di Tommaso
@pditommaso
Apr 24 2018 14:22
submitting a pipeline
Alexander Peltzer
@apeltzer
Apr 24 2018 14:22
Yup and then nf-core/nf-core.github.io#1
Venkat Malladi
@vsmalladi
Apr 24 2018 14:29
okay will do
Paolo Di Tommaso
@pditommaso
Apr 24 2018 14:35
Nice!
Phil Ewels
@ewels
Apr 24 2018 14:50

submitting a pipeline

You don't have to submit a pipeline to be involved :) There's a lot of conversation going on in the gitter room https://gitter.im/nf-core/Lobby

Venkat Malladi
@vsmalladi
Apr 24 2018 15:02
@ewels going to join
Going to also develop an ATAC-seq pipeline
Phil Ewels
@ewels
Apr 24 2018 15:05
Cool! We have a ChIP-seq pipeline that I intend to move over pretty soon: https://github.com/SciLifeLab/NGI-ChIPseq
We use this for ATAC-seq as well at the moment
any inputs welcome though :+1:
Ashley S Doane
@DoaneAS
Apr 24 2018 15:06
@ewels @vsmalladi feel free to use any aspects of my pipeline https://github.com/DoaneAS/atacflow
Venkat Malladi
@vsmalladi
Apr 24 2018 15:06
@DoaneAS thanks
Ashley S Doane
@DoaneAS
Apr 24 2018 15:08
it's not as fancy as you guys :) but follows ENCODE best practice and it is also the code we are using in our bioinformatics core
Phil Ewels
@ewels
Apr 24 2018 15:09
Cool stuff :+1: Looks like these all have quite a lot in common, which is great!
Venkat Malladi
@vsmalladi
Apr 24 2018 15:09
Great @DoaneAS I will also be following ENCODE best practices to build the pipeline and add additional qc
Phil Ewels
@ewels
Apr 24 2018 15:10
Shall we continue this discussion over at https://gitter.im/nf-core/Lobby ?
Venkat Malladi
@vsmalladi
Apr 24 2018 15:10
@ewels is the goal of all the nf-core pipelines to all be the same
core functions
Paolo Di Tommaso
@pditommaso
Apr 24 2018 15:11
I love this community! :)
Phil Ewels
@ewels
Apr 24 2018 15:11
depending on what you mean by "be the same", then probably yes - to follow the same guidelines for how to operate, yes
See http://nf-co.re/ for details of what I mean here (new website still under construction, so apologies if it's rough around the edges)
Venkat Malladi
@vsmalladi
Apr 24 2018 15:13
@ewels thanks
Edgar
@edgano
Apr 24 2018 15:13
yeah, I am looking at nf-core too... but it seems to go "too fast" to follow hahahaha
Venkat Malladi
@vsmalladi
Apr 24 2018 15:14
ya trying to use some of the same principles
Phil Ewels
@ewels
Apr 24 2018 15:28
hah, yes it has been going pretty quickly lately @edgano ;) You just need to jump in and go with the flow!
The spec has been settling down a bit though
Stephen J Newhouse
@snewhouse
Apr 24 2018 15:30
I dip into this gitter every now and then, like a geeky code stalker, and every time I
Am stoked at the conversation, and collaboration
And awesomeness of you all
+1 for nf-co.re
Phil Ewels
@ewels
Apr 24 2018 16:41
:tada:
Vladimir Kiselev
@wikiselev
Apr 24 2018 20:27
@pditommaso why, when a process has multiple input channels and one of these channels is a static folder/file, does one have to add .collect() to the channel name? Otherwise, the process gets executed only for the first entry.
Does my question make sense?
Paolo Di Tommaso
@pditommaso
Apr 24 2018 20:30
let me see if I find an answer in the google forum :)
can't find, I need to write a faq for this, anyhow
the point is that a process is executed when a channel contains a value, and the execution consumes that value
right?
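
A minimal sketch of the consumption behaviour described above, written in the DSL1 syntax of the time. The channel names, process name and file paths are hypothetical. The idea is that a channel holding a single file is emptied by the first task, while collect() turns it into a value that every task can read.

```groovy
// Sketch only: names and paths are hypothetical, DSL1 syntax (2018-era Nextflow).
samples_ch = Channel.fromPath('data/*.bam')      // queue channel with many items
genome_ch  = Channel.fromPath('ref/genome.fa')   // queue channel with ONE item

process align {
    input:
    file bam from samples_ch
    // Without .collect() the single genome item is consumed by the first task,
    // so the process runs once and then stops waiting for more input.
    // collect() yields a value channel that can be read by every task.
    file genome from genome_ch.collect()

    script:
    """
    echo "$bam uses $genome"
    """
}
```
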
Paolo Di Tommaso
@pditommaso
Apr 24 2018 20:49
Let's continue tomorrow :wave:
Steven Davis
@sgdavis1
Apr 24 2018 21:04

@pditommaso I was hoping I could ask you a question about NF memory management. I built a pipeline to do next-gen sequence analysis, and several tools can require a large amount of RAM (up to 30GB per process). Now in my main nextflow.config file I have specified a limit of memory usage for the local executor of 100GB (the machine has 128GB of RAM).

My questions:

  • Is this memory limit a hard limit that NF will never exceed? For example, if I start 3x 30GB processes, will a 4th start since we haven't hit 100GB total yet? Or is this a situation where I should explicitly tell each process how much RAM it is allowed (I am not yet doing that)?
  • If I specify a high limit (like in my example) and then other external processes start to use RAM+swap outside of NF, will I still experience an out-of-memory situation due to those external programs? (I am quite certain this is the case, just wanted clarification)

TIA for any insight you can provide!
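
A sketch of the explicit per-process declaration raised in the first question, using the figures from the message. As far as I understand, the local executor budgets against the memory each task declares rather than what it actually uses, so under that assumption these settings would admit at most three 30 GB tasks at once, and they would not account for programs running outside Nextflow.

```groovy
// nextflow.config (sketch; values taken from the question, behaviour as assumed above)
executor {
    name   = 'local'
    memory = '100 GB'   // total memory the local executor may hand out to tasks
}

process {
    memory = '30 GB'    // request declared for each task; tune per process as needed
}
```
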

Vladimir Kiselev
@wikiselev
Apr 24 2018 21:41
Sorry, @pditommaso, sure let’s continue tomorrow! ;-) I was thinking you would reply tomorrow anyway )