These are chat archives for nextflow-io/nextflow

28th
May 2017
Robert Syme
@robsyme
May 28 2017 03:34
Hi all. I'm getting an odd error at the end of my .command.err files.
Each process finishes by printing to stderr something like:
whatever/work/02/0f090cf0511418a9459d66513b77a7/.command.run.1: line 99: 47616 Terminated nxf_trace "$pid" .command.trace
The job seems to finish without too much trouble and .exitcode is 0, but is this normal behaviour?
amacbride
@amacbride
May 28 2017 05:06
@robsyme As far as I know, yes.
It's cleaning up the tracing function as the script itself terminates -- that function is what writes the .command.trace file tracking CPU, memory usage, etc.
@pditommaso Question for you: I was trying to get the value of the process.queue property from within my script -- I'd like to dynamically construct the value passed to the queue setting, so that some processes use the default queue (whose name might change depending on the environment) and others use the high-priority queue, which is set up as ${process.queue}-hi
Is that possible?
So for example, most things might use the "sierra" partition, but some could be marked as "sierra-hi", or "mountain" and "mountain-hi", etc.
amacbride
@amacbride
May 28 2017 05:13

I tried

queue "${process.queue}-hi"

but got:

Cannot get property 'queue' on null object
Robert Syme
@robsyme
May 28 2017 12:07
Phew. Thanks @amacbride.
Paolo Di Tommaso
@pditommaso
May 28 2017 16:25
@robsyme nope, that's just fine. Unfortunately I haven't found a way to silence that message.
@amacbride do you mean from within a process? You should use task.queue
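For illustration, a minimal sketch of reading the resolved queue from inside a process script (the process name showQueue is made up; the queue name just reuses the "sierra" example from above):

process showQueue {
    queue 'sierra'

    script:
    """
    echo "this task was submitted to queue: ${task.queue}"
    """
}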
amacbride
@amacbride
May 28 2017 17:00

@pditommaso I was asking about process.queue because it looks like if you define queue in terms of task.queue, it ends up in a loop and overflows the stack:

May-28 09:56:59.446 [Actor Thread 12] ERROR nextflow.processor.TaskProcessor - Execution aborted due to an unexpected error
java.lang.StackOverflowError: null 
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:442)

...which makes sense.

The behavior I'd like is to set the default queue in the config file, but then be able to override it on a per-process basis by deriving a new value from the default. (If that makes sense.)
Paolo Di Tommaso
@pditommaso
May 28 2017 17:02
oh, a recursive definition! :grin:
amacbride
@amacbride
May 28 2017 17:02
exactly
Paolo Di Tommaso
@pditommaso
May 28 2017 17:03
just use a param for that, e.g.
params.defQueue = 'foo'

process.queue = { condition ? "${params.defQueue}-hi" : params.defQueue }
amacbride
@amacbride
May 28 2017 17:09

Well, I half-followed that. :)

What would I then put in the definition of a process that should use the "-hi" queue?

Paolo Di Tommaso
@pditommaso
May 28 2017 17:10
ah, because you want the condition to depend on the actual process?
amacbride
@amacbride
May 28 2017 17:13
Yes. I have some processes that are bottlenecks, and what I'm seeing in certain runs is that because everything is currently being scheduled "fairly", I end up with one or two samples that don't reach their bottleneck process until very late, so the cluster mostly sits idle while one straggler catches up.
I think if I boost the priority of a few key bottleneck processes, I should get much better overall throughput and a more "compact" runtime topology.
Paolo Di Tommaso
@pditommaso
May 28 2017 17:14
wouldn't something like this be easier?
amacbride
@amacbride
May 28 2017 17:15
?
Paolo Di Tommaso
@pditommaso
May 28 2017 17:15
params.defQueue = 'plain-queue'

process.queue = params.defQueue
process.$foo.queue = "${params.defQueue}-hi"
process.$bar.queue = "${params.defQueue}-hi"
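For reference, the same per-process override can also be sketched with the withName config selector available in more recent Nextflow versions (here foo and bar just stand for whatever the actual process names are):

params.defQueue = 'plain-queue'

process {
    queue = params.defQueue
    withName: 'foo|bar' {
        queue = "${params.defQueue}-hi"
    }
}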
amacbride
@amacbride
May 28 2017 17:17

Yes, I could do that, but then it moves the queue definition out of the main script and into the config file, and it's easier for those to get out of sync -- I'd prefer to do it as an override in the process definition itself, so that it's clear that it's a special case.

(But that's just my quirk, your solution will work, so I will try it!) Thanks!

Paolo Di Tommaso
@pditommaso
May 28 2017 17:19
if you want a dynamic solution you can use a custom ext directive to parameterize the queue definition
amacbride
@amacbride
May 28 2017 17:23
I've never used ext -- could you point me to an example that uses it?
Paolo Di Tommaso
@pditommaso
May 28 2017 17:25
would having the generic queue definition in the config file be fine?
amacbride
@amacbride
May 28 2017 17:30

Yes -- would it be possible to try something like:

process {
    executor = "slurm"
    queue = "default"
    scratch = true
    cleanup = true
}

params.fastQueue = "${process.queue}-hi"

and then in the process definition do:

process mustGoFast {
    queue params.fastQueue
}
Paolo Di Tommaso
@pditommaso
May 28 2017 17:31
missing a piece ?
amacbride
@amacbride
May 28 2017 17:31
(fat fingers!)
Paolo Di Tommaso
@pditommaso
May 28 2017 17:31
:)
amacbride
@amacbride
May 28 2017 17:32
(plus, my VPN dropped at that specific instant)
Paolo Di Tommaso
@pditommaso
May 28 2017 17:32
nope, but you'd write that as
params.queue = 'something'
params.fastQueue = "${params.queue}-hi"

process {
    executor = "slurm"
    queue = "default"
    scratch = true
    cleanup = true
}
then use params.fastQueue in the main script
instead, my idea would be something like this:
params.defQueue = 'default'
process {
    executor = "slurm"
    queue = { task.ext.queueRule ? "${params.defQueue}-hi" : params.defQueue } 
    scratch = true
    cleanup = true
}
then in the main script something like
process foo {
  ext.queueRule { /*condition to select default or fast queue */  }
  script: 
  """
  command_as_usual
  """
}
does that make sense?
amacbride
@amacbride
May 28 2017 17:43

That does make sense, but it's overkill for what I need -- the set of processes that need to go fast is static, so I can just do this:

process {
    executor = "slurm"
    queue = "duo"
    scratch = true
    cleanup = true
}

params.fastQueue = "${process.queue}-hi"

and then annotate the processes in question with a queue directive that uses params.fastQueue, for example:
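A minimal sketch of such an override (the process name indelRecalibration and its script body are just placeholders):

process indelRecalibration {
    queue params.fastQueue

    script:
    """
    echo "running in the fast queue: ${task.queue}"
    """
}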

(But I may need the dynamic thing at some point, so I will take note. I just tried the above and it worked fine.)

To be concrete, I want to increase the chances that the indel recalibration steps for all 16 samples complete before some of the downstream variant detection work starts -- otherwise I can get a straggler, which increases my overall runtime by several hours. Annoying in terms of runtime, and expensive in terms of AWS cost.
dataflow, but with a hint :)
Paolo Di Tommaso
@pditommaso
May 28 2017 17:46
:+1:
amacbride
@amacbride
May 28 2017 17:46
Thanks for your help, now I'll go see if it actually works!
Paolo Di Tommaso
@pditommaso
May 28 2017 17:47
I still prefer this version:
params.queue = 'default'
params.fastQueue = "${params.queue}-hi"

process {
    executor = "slurm"
    queue = params.queue 
    scratch = true
    cleanup = true
}
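One possible advantage of keeping both names in params (just an illustration; main.nf and the queue names are made up): they can then be overridden from the command line at launch time, e.g.

nextflow run main.nf --queue sierra --fastQueue sierra-hi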
amacbride
@amacbride
May 28 2017 17:54
Yes, will do that if I ever need to parameterize the base queue name.
Paolo Di Tommaso
@pditommaso
May 28 2017 17:55
:+1:
amacbride
@amacbride
May 28 2017 17:58
(in case it's of interest) This diagram is for a single sample going through the pipeline, and I'm running 16 at a time. As you can see, there's a bottleneck near the middle -- by making sure that step is complete across all 16 samples before starting the bushier part of the graph later on, I should get much better cluster utilization. (This is with SLURM's preemptible queues, so it's really a fast lane that temporarily suspends other tasks if a high-priority one comes along.)
Paolo Di Tommaso
@pditommaso
May 28 2017 17:59
I see, interesting
amacbride
@amacbride
May 28 2017 17:59
(trying to figure out how to paste an image)
Paolo Di Tommaso
@pditommaso
May 28 2017 18:00
you have to upload somewhere and paste the link
amacbride
@amacbride
May 28 2017 18:01
Meh.
Paolo Di Tommaso
@pditommaso
May 28 2017 18:01
:)
dropbox ?
amacbride
@amacbride
May 28 2017 18:10
Well, we can all just use our imagination, instead. :)
Paolo Di Tommaso
@pditommaso
May 28 2017 18:15
ahah, sure