These are chat archives for nextflow-io/nextflow

17th
Feb 2017
Mike Smoot
@mes5k
Feb 17 2017 00:06
Hi @pditommaso, I'm experimenting with a slurm cluster and am wondering if I can see the sbatch command used to submit the tasks? I don't see them in work or .nextflow.log
Maybe a trace flag I can enable somewhere?
Phil Ewels
@ewels
Feb 17 2017 04:29
Hi Mike - have a look at the files in the work directories; each directory will have a hidden file which is the sbatch script used to launch the job.
Paolo Di Tommaso
@pditommaso
Feb 17 2017 09:07
You can find it here
You can even enable tracing in the log file, but it would not produce any useful info because, as you can see, it just runs sbatch .command.run
as mentioned by Phil (welcome back!) SLURM options are defined in the .command.run wrapper file in each task work dir
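for example, the header of a .command.run generated for the SLURM executor looks roughly like this (illustrative, the exact directives depend on your process settings):

#!/bin/bash
#SBATCH -D /path/to/task/work/dir
#SBATCH -J nf-your_task_name
#SBATCH -o /path/to/task/work/dir/.command.log
#SBATCH --no-requeue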
LukeGoodsell
@LukeGoodsell
Feb 17 2017 13:34
Hey @pditommaso, is it possible to have a shell or script block that is kept as-is in the .command.sh (ie, no spaces added at the beginning of lines), without using a template?
I ask because I’ve got a small process that just needs to output a block of text in a particular format using a heredoc, and it would be unfortunate if I need to write a separate file just for that.
Sorry, ignore that
My mistake, it already is kept as-is
Paolo Di Tommaso
@pditommaso
Feb 17 2017 13:35
:+1:
LukeGoodsell
@LukeGoodsell
Feb 17 2017 15:35

Hi @pditommaso, I have a process that takes all the items in a channel and I’d like to use an exec block to store the items as a list entry in a hash map. E.g.:

process someProcess {

    input:
    file allFiles from fileChannel.toList()

    exec:
    def options = [:]
    options.allFiles = allFiles
}

I’m new to Groovy, and I can’t see how to make it work. Any suggestions?

Paolo Di Tommaso
@pditommaso
Feb 17 2017 15:48
there's something missing, how would you transform a list into a map ?
LukeGoodsell
@LukeGoodsell
Feb 17 2017 15:49
options is the map. I’d like options.allFiles to be a list of the file paths
Paolo Di Tommaso
@pditommaso
Feb 17 2017 15:49
ah
ok, however you don't need a process to do that
just
fileChannel.toList().map { [allFiles: it] }
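for example, if the channel holds a.txt and b.txt, that emits the single value [allFiles: [a.txt, b.txt]]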
don't confuse the map operator with map as a data structure .. :)
does it make sense ?
LukeGoodsell
@LukeGoodsell
Feb 17 2017 15:57
Not quite. I’m preparing a map structure that will be written to a YAML file, and I’d like one of the keys to point to a list of all the filepaths coming out of another channel. I’ll have other values to set as well, and I want it all to be complete before passing the output file to the next process, hence why I’m putting it in a process.
I can’t see how to make it do what I’m intending...
Let me mock up a more detailed example
Paolo Di Tommaso
@pditommaso
Feb 17 2017 16:01
I see
LukeGoodsell
@LukeGoodsell
Feb 17 2017 16:02
import org.yaml.snakeyaml.Yaml
import org.yaml.snakeyaml.DumperOptions

process precursor {
    output:
    file '*.txt' into fileChannel

    script:
    """
    for num in \$(seq 1 10); do echo "asdf" > "\${num}.txt"; done
    """
}

process nextTaskParams {

    input:
    file allFiles from fileChannel.toList()

    output:
    file "nextTaskParams.yaml" into nextTaskParamsFile

    exec:
    def nextTaskParamsFilePath = new File("${task.workDir}/nextTaskParams.yaml")
    DumperOptions dumperOptions = new DumperOptions()
    dumperOptions.setPrettyFlow(true)
    dumperOptions.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK)
    def yaml = new Yaml(dumperOptions)

    def options = [:]
    options.allFiles = allFiles
    options.other = ...

    yaml.dump(options, new FileWriter(nextTaskParamsFilePath))
}

process nextTask {
    input:
    file "nextTaskParams.yaml" from nextTaskParamsFile

    ...
}
Paolo Di Tommaso
@pditommaso
Feb 17 2017 16:04
One thing at a time, this should work. What exactly is the problem ?
LukeGoodsell
@LukeGoodsell
Feb 17 2017 16:06
I get this:
allFiles: !!sun.nio.fs.UnixPath {
  }
in the output yaml file
Paolo Di Tommaso
@pditommaso
Feb 17 2017 16:06
ahh
because files are represented as Java Path objects, not as strings
replace this
options.allFiles = allFiles
with
options.allFiles = allFiles.collect { it.toString() }
then it should work
LukeGoodsell
@LukeGoodsell
Feb 17 2017 16:08
Close but not quite. I get:
allFiles:
- input.1
I see that in work/tmp/xx/xxxx../ there’s a file called input.1, and it contains a list of the file paths
Paolo Di Tommaso
@pditommaso
Feb 17 2017 16:11
ok, for this refactor your code without using a process, for example:
fileChannel.toList().map { allFiles ->

    // there's no task work dir outside a process, so write to the launch dir
    def nextTaskParamsFilePath = new File('nextTaskParams.yaml')
    DumperOptions dumperOptions = new DumperOptions()
    dumperOptions.setPrettyFlow(true)
    dumperOptions.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK)
    def yaml = new Yaml(dumperOptions)

    def options = [:]
    options.allFiles = allFiles.collect { it.toString() }
    options.other = ...

    yaml.dump(options, new FileWriter(nextTaskParamsFilePath))
    return nextTaskParamsFilePath
}
.set { nextTaskParamsFile }
process is meant to be applied to multiple inputs and it creates its own working dir
LukeGoodsell
@LukeGoodsell
Feb 17 2017 16:14
And if I have another channel feeding into it?
Paolo Di Tommaso
@pditommaso
Feb 17 2017 16:16
you have other files to be stored in yaml ?
LukeGoodsell
@LukeGoodsell
Feb 17 2017 16:17
Yes. I have a downstream program that takes a yaml configuration file, which contains multiple lists of files from previous parallel processes.
Paolo Di Tommaso
@pditommaso
Feb 17 2017 16:19
OK, in this case a process is a better choice
to solve your problem with the file names
change the input declaration from file to val
in the end you don't need to access the files for reading, only their paths
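i.e. something like this (sketch, the rest of the process unchanged):

input:
val allFiles from fileChannel.toList()

with val the files are not staged into the task work dir, so allFiles.collect { it.toString() } gives you the original paths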
LukeGoodsell
@LukeGoodsell
Feb 17 2017 16:21
Ok. I’m afraid I have to go AFK. I’ll look into that.
Thanks for your help
Paolo Di Tommaso
@pditommaso
Feb 17 2017 16:21
ok
Mike Smoot
@mes5k
Feb 17 2017 17:23
Thanks @pditommaso and @ewels for the pointers on sbatch. I'd been grepping for sbatch (all lowercase) and couldn't find anything in the work dir, which is why I was confused. If I'm understanding how sbatch works, it treats the lines in .command.run prefixed with #SBATCH as if they were command line arguments, correct?
Paolo Di Tommaso
@pditommaso
Feb 17 2017 17:51
exactly
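e.g. a script starting with

#SBATCH --no-requeue
#SBATCH --mem=4096

behaves as if you had run sbatch --no-requeue --mem=4096 script.sh (illustrative values)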
Mike Smoot
@mes5k
Feb 17 2017 18:07
Cool, thanks. Do you happen to remember why the --no-requeue option is hard coded for slurm? I'm wondering if that might impact running a cluster in AWS with spot pricing instead of on-demand.
Paolo Di Tommaso
@pditommaso
Feb 17 2017 18:12
because job re-execution needs to be managed by NF not by the slurm scheduler
Mike Smoot
@mes5k
Feb 17 2017 18:14
That's what I figured. I was wondering if I'd be able to get the same magic spot-instance restart behavior with cfncluster that exists for nextflow cloud
Paolo Di Tommaso
@pditommaso
Feb 17 2017 18:16
do you mean jobs stopped by retired spot instances ?
Mike Smoot
@mes5k
Feb 17 2017 18:16
Yes
Paolo Di Tommaso
@pditommaso
Feb 17 2017 18:17
nope, that's implemented by the NF cloud scheduler
but it should work more or less the same as long as you set a retry errorStrategy
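e.g. in your nextflow.config, something like (sketch, the retry count is arbitrary):

process {
    errorStrategy = 'retry'
    maxRetries = 3
}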
Mike Smoot
@mes5k
Feb 17 2017 18:18
I think managing things with errorStrategy should be fine.
Paolo Di Tommaso
@pditommaso
Feb 17 2017 18:19
what's the advantage of cfncluster over NF cloud ?
Mike Smoot
@mes5k
Feb 17 2017 18:20
Allowing multiple instances of nextflow to run simultaneously.
Paolo Di Tommaso
@pditommaso
Feb 17 2017 18:21
:+1:
Mike Smoot
@mes5k
Feb 17 2017 18:23
The goal is to have a production cluster that can handle multiple requests to run pipelines. The architecture I'm currently aiming for is a celery queue that manages requests to run nextflow pipelines, then one or more clusters running celery workers that spawn nextflow pipelines, and then the slurm queues on the individual clusters managing the resources of each cluster.
Paolo Di Tommaso
@pditommaso
Feb 17 2017 18:25
I see, makes sense
Mike Smoot
@mes5k
Feb 17 2017 18:25
only about a million moving parts... :)
Paolo Di Tommaso
@pditommaso
Feb 17 2017 18:25
exactly :)
Félix C. Morency
@fmorency
Feb 17 2017 19:19
anyone using maas here?
i was thinking on using a nf cloud-like, but for on-premise hardware
Paolo Di Tommaso
@pditommaso
Feb 17 2017 19:31
what's maas ?
Félix C. Morency
@fmorency
Feb 17 2017 19:35
Metal as a Service
Paolo Di Tommaso
@pditommaso
Feb 17 2017 19:37
uh !
we use NF on a UGE on-premises cluster
Félix C. Morency
@fmorency
Feb 17 2017 19:46
The idea (not sure if it's a good one) is to acquire/deploy/release an ignite cluster each time someone needs to launch processing. Just like you launch an instance in the cloud, but on-premise metal
Paolo Di Tommaso
@pditommaso
Feb 17 2017 19:47
could be interesting, I would go more for a standard batch scheduler in this scenario
Félix C. Morency
@fmorency
Feb 17 2017 20:16
Right. I'll play with Juju and slurm and see how far I get
amacbride
@amacbride
Feb 17 2017 20:28
@pditommaso @mes5k One thing to note about SLURM (that I just found out recently) is that the usage data (mem, cpu) is only tracked and available from the SLURM accounting system if srun is used to launch a task, not sbatch. An sbatch script that then uses srun to launch the task will work.
I was just about to go looking through the NF code to see if it would be possible to do it -- I dislike the additional level of indirection, but I would love to see the detailed memory usage stats.
Trevor Tanner
@tantrev
Feb 17 2017 20:35
@amacbride when I use -with-trace with NF + Slurm, it seems to capture all those stats?
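e.g. nextflow run main.nf -with-trace (main.nf as a placeholder) writes a trace.txt with per-task columns like %cpu, rss, vmem and peak_rss (column names from memory, may vary by version)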
Félix C. Morency
@fmorency
Feb 17 2017 20:37
What guide/blog/article on slurm deployment would you guys recommend?
Trevor Tanner
@tantrev
Feb 17 2017 20:39
It has some shortcomings (like I haven't tried it simultaneously with lustre) but I've used elasticluster to deploy a relatively pain-free SLURM cluster
Félix C. Morency
@fmorency
Feb 17 2017 20:40
@tantrev on-premise or in the cloud?
Trevor Tanner
@tantrev
No idea though if you meant local. But if you use GCE, I would recommend dockerflow instead of SLURM
my apologies, that was a Gridengine cluster, not SLURM (I could never get SLURM to work with elasticluster and GCE)
Félix C. Morency
@fmorency
Feb 17 2017 20:45
oh okay
amacbride
@amacbride
Feb 17 2017 20:46
@tantrev I'm talking about the SLURM-internal accounting, not the NF flavor (which is also quite useful but in a different way)
sacct, etc.
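e.g. something like sacct -j <jobid> --format=JobID,JobName,Elapsed,MaxRSS,TotalCPU, where <jobid> is a placeholder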
Mike Smoot
@mes5k
Feb 17 2017 20:46
I've been able to get SLURM working in AWS with cfncluster, although I'm now struggling with getting SLURM properly configured so that it actually sees and consumes all of the resources it's given...
Félix C. Morency
@fmorency
Feb 17 2017 20:48
doesn't make me want to go the slurm/batch scheduler way ;) I had a lot of success with the built-in ignite support. might stick to that for the moment
amacbride
@amacbride
Feb 17 2017 20:50
@mes5k I'll have to look at cfncluster again; I wasn't thrilled with it when I tried it last year, and ended up just using Ansible to provision my AWS SLURM cluster.
Mike Smoot
@mes5k
Feb 17 2017 20:51
@fmorency I'm using SLURM because I'd like to be able to run multiple instances of nextflow at once, something ignite isn't apparently great at. I'd also like to be able to send specific tasks to specific queues (e.g. blast jobs to machines primed for blasting).
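e.g. with a per-process queue directive, something like (sketch, queue name assumed):

process blast {
    queue 'blast-nodes'

    // ... rest of the process
}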
@amacbride I'm not sure I'm thrilled with cfncluster either and will probably go the ansible route eventually myself. For now it at least gets me going.
Trevor Tanner
@tantrev
Feb 17 2017 21:03
@pditommaso I think I may have encountered some sort of NF stream leak. It's kind of hard to supply a test case b/c the data I'm using is ~700GB & reproducing the problem seems to require the whole dataset. But the work folder of a problem command is here and my NF script is here. Basically when I have something like string = "echo ${var1} ${var1}", I'm getting two different values substituted for var1.
Trevor Tanner
@tantrev
Feb 17 2017 21:11
An explicit example of the issue may also be seen here
Paolo Di Tommaso
@pditommaso
Feb 17 2017 21:28
@amacbride really? sbatch doesn't report accounting information, quite strange
however srun is not an option because it's a blocking command
@tantrev for any NF error please report it as a GH issue, gitter is a mess for that
Trevor Tanner
@tantrev
Feb 17 2017 21:31
@pditommaso will do, sorry about that
Paolo Di Tommaso
@pditommaso
Feb 17 2017 21:32
no problem! you are welcome
amacbride
@amacbride
Feb 17 2017 21:32
Right -- from my investigations, it would require yet another script to wrap the NF-generated one, run by srun, run by sbatch. And that's just too ugly.
Paolo Di Tommaso
@pditommaso
Feb 17 2017 21:33
yes, it's already too complicated ;)
IMO all batch schedulers suck when deployed in the cloud
Félix C. Morency
@fmorency
Feb 17 2017 21:36
@mes5k got it. I'm still new to all this config/deployment world. We have some on-premise hardware we want to use
still exploring what the best option is