These are chat archives for nextflow-io/nextflow

11th
Apr 2019
chdem
@chdem
Apr 11 07:23
Hello NXF community! I have a very small question. Would you agree with this syntax?
file(params.output_dir, type: dir, checkIfExists: true)
Rad Suchecki
@rsuchecki
Apr 11 07:25
Looks like Channel.fromPath() syntax not file()
not sure what exactly is the question tho :wink:
chdem
@chdem
Apr 11 07:26
@rsuchecki It is indeed Channel syntax, but it is also file() syntax: https://www.nextflow.io/docs/latest/script.html?highlight=checkifexists#files-and-i-o
I got an error with this syntax:
ERROR ~ No such variable: dir
I can't see where my mistake is oO'
as everything seems OK according to the documentation
Rad Suchecki
@rsuchecki
Apr 11 07:27
possibly 'dir'?
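A minimal sketch of the quoted form, assuming `params.output_dir` points to a directory that already exists (the `'results'` default below is hypothetical). Unquoted, Groovy reads `dir` as a variable name, which is what produces `No such variable: dir`:

```groovy
// Hypothetical default; normally supplied on the command line as --output_dir
params.output_dir = 'results'

// The type option expects a string, so 'dir' must be quoted
output_dir = file(params.output_dir, type: 'dir', checkIfExists: true)
println "Output directory: ${output_dir}"
```

Because `checkIfExists: true` is set, the run should stop with an error if the path does not exist.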
chdem
@chdem
Apr 11 07:28
[ashamed] pfffff .... thanks @rsuchecki .... I definitely need my morning coffee....
Paolo Di Tommaso
@pditommaso
Apr 11 07:29
:coffee: :coffee: :coffee: :smile:
Rad Suchecki
@rsuchecki
Apr 11 07:29
:100:

Are there any cluster-specific guidelines for installing NF? The standard user install works fine for me, but I'm curious whether it should be done differently to make it available as a module.

Or is it meant to always be installed in user space?

Paolo Di Tommaso
@pditommaso
Apr 11 07:36
That was number one requirement, but it's perfectly fine to install it inside
Oops
*to install it in a centralised manner
(on mobile)
Rad Suchecki
@rsuchecki
Apr 11 07:39
Right, are there any guidelines on how to set things up?
This can wait - gitter on mobile can be a pain - still haven't found a back-tick `` :grin:
Ólavur Mortensen
@olavurmortensen
Apr 11 09:42
When I want to use an output from a process more than once, what is the best way to store the output? Storing it in two different channels? Storing it in the same channel twice?
Maciej Pawlaczyk
@Fizol
Apr 11 10:22
Hi guys, I want to configure the Nextflow logger (I suppose it's logback). How can I do it?
Maciej Pawlaczyk
@Fizol
Apr 11 10:29
In particular, is it possible to inject e.g. a logback.xml? I have a pretty complex configuration.
Jonas T Björkman
@tintin42
Apr 11 12:52
Question regarding how jobs are handled when we submit them to our cluster using Slurm: NF does not spawn all jobs at once like other tools, with the rationale that if a pipeline execution is not progressing, downstream tasks just sit and wait for the previous tasks to complete. But this also gives us issues with potentially quite large delays in pipelines.
An easy example: We have a pipeline that first submits jobs that take a while to finish (e.g. different variant callers), and at the end there is a quick concatenation of the output of the previous jobs. This last job only takes a second or so to run. But while the first jobs were running, other users submitted big jobs to the cluster, and since NF did not spawn the short concatenation job from the beginning, it is placed last in the queue, delaying it until the queue has been cleared.
One can partially alleviate the problem by using different priorities and such, but in general not spawning jobs at the beginning disrupts the FIFO idea of a Slurm queue, since other tools submit all jobs at once and wait for the job manager to do its job and prioritize the workload. For me the logical thing would be that you could choose to have the jobs submitted to Slurm from the beginning, and that the potentially orphaned jobs that remain in the queue are handled in another manner, but maybe I am missing something of the larger picture as to why NF handles the jobs this way?
Rad Suchecki
@rsuchecki
Apr 11 13:08
Downstream process cannot be submitted without all its inputs.
For very quick tasks set executor to local
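As a rough sketch of that suggestion, a `nextflow.config` could send most processes to Slurm and override the executor for one short task with a `withName` selector. The process name `concat_calls` is hypothetical and stands in for the quick concatenation step:

```groovy
// nextflow.config sketch; 'concat_calls' is a hypothetical process name
process {
    // Default: submit tasks to the Slurm scheduler
    executor = 'slurm'

    // Run the short concatenation step on the machine running Nextflow,
    // so it never waits in the cluster queue
    withName: concat_calls {
        executor = 'local'
    }
}
```

This only makes sense when the node running Nextflow is allowed to do light work itself.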
Rad Suchecki
@rsuchecki
Apr 11 13:15
This is quite fundamental imo to the data flow paradigm or at least to how it is implemented in NF. Without the inputs having been generated the required info just isn't there.
This is separate from submission rate settings, so for independent jobs you should be able to flood the scheduler if that is actually desired
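For reference, the submission settings mentioned here live in the `executor` scope of the configuration. The fragment below is only a sketch: the values are illustrative, and the exact format accepted by `submitRateLimit` is best checked against the documentation for the Nextflow version in use:

```groovy
// nextflow.config sketch; values are illustrative, not recommendations
executor {
    name = 'slurm'
    // Number of tasks the executor handles in parallel
    queueSize = 500
    // Assumed format: at most 50 job submissions every 2 minutes
    submitRateLimit = '50/2min'
}
```

Raising these limits only affects tasks whose inputs are already available; it does not let a downstream task be queued early.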
Rad Suchecki
@rsuchecki
Apr 11 13:28
@olavurmortensen my guess would be the same output into 2 channels, but it's not really clear what the goal is
Ólavur Mortensen
@olavurmortensen
Apr 11 13:36
@rsuchecki I need to use the output from one process in two different processes. When these processes receive the input from the channel, the latter of the two will find that the channel is empty.
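One way this was commonly handled in the Nextflow syntax of that period (DSL1) is to list two target channels after `into` in the output declaration, so each consumer gets its own copy. This is a sketch only; all process and channel names are hypothetical:

```groovy
// DSL1 sketch; process and channel names are hypothetical
process make_result {
    output:
    // Emit the same file into two separate channels
    file 'result.txt' into result_for_a, result_for_b

    script:
    """
    echo 'some output' > result.txt
    """
}

process use_a {
    input:
    file x from result_for_a

    script:
    """
    wc -l $x
    """
}

process use_b {
    input:
    file x from result_for_b

    script:
    """
    cat $x
    """
}
```

In the later DSL2 syntax a channel can be consumed by several processes, so this duplication is no longer needed there.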
Tobias "Tobi" Schraink
@tobsecret
Apr 11 14:19
@rsuchecki I am a user of a cluster and I just keep my own install. It's lightweight, and the release cycle of NF is too fast for our sysadmins to keep up with, so it's just easier to keep my own install. All it depends on is Java anyway. For dockerized workflows, I use the singularity install maintained by the sysadmins though. Oh and I have my own conda install, too.