These are chat archives for nextflow-io/nextflow

12th
Apr 2017
Matthieu Foll
@mfoll
Apr 12 2017 06:59
Well deserved!
Paolo Di Tommaso
@pditommaso
Apr 12 2017 07:14
:v:
Rickard Hammarén
@Hammarn
Apr 12 2017 07:39
Nicely done! 🍾
Paolo Di Tommaso
@pditommaso
Apr 12 2017 07:40
thanks
Félix C. Morency
@fmorency
Apr 12 2017 13:43
@pditommaso do you have a non-paywalled version of the article? :D
Paolo Di Tommaso
@pditommaso
Apr 12 2017 13:44
Hi, I can send it to you privately
Félix C. Morency
@fmorency
Apr 12 2017 13:44
I would like that
Shawn Rynearson
@srynobio
Apr 12 2017 17:23
Quick question. Reading through your docs for launching on SLURM, I was wondering: other than adding the --ntasks option (through clusterOptions) in a config file, can one have multiple jobs launched per node?
The reason I ask is that the HPC center I'm using does not allow the --ntasks option.
Paolo Di Tommaso
@pditommaso
Apr 12 2017 18:07
@srynobio NF delegates the scheduling to the underlying resource manager, i.e. SLURM in your case
multiple tasks will be executed on each node, provided there are enough resources, without having to specify the --ntasks option
at least that is the common behaviour with other batch schedulers (I'm not a SLURM expert)
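for example, something along these lines (just a sketch, assuming a generic SLURM setup): declare the resources each task needs in the config and let SLURM pack the tasks onto the nodes

process {
    executor = 'slurm'
    cpus = 2
    memory = '4 GB'
}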
Félix C. Morency
@fmorency
Apr 12 2017 18:10
yes, this is what's happening with slurm too
Shawn Rynearson
@srynobio
Apr 12 2017 18:33

I'm sorry, but I'm not understanding what this means: "multiple tasks will be executed on each node, provided there are enough resources, without having to specify the --ntasks option". Is that based on how you write the process?

Simple example:

I have a fastqc process that:

process fastqc {

    input:
    file f from myFastqs

    when:
    f.name =~ /fastq$/

    cpus 2

    script:
    """
    fastqc --threads ${task.cpus} -f fastq $f
    """
}

And a config:

process {
    executor = 'slurm'
    clusterOptions = '--account=my-account --partition=my-partition'
}

when I launch with the command
nextflow run my.nf -c my.config

each node gets one single fastqc run.

Paolo Di Tommaso
@pditommaso
Apr 12 2017 18:36
cpus 2 should go before input:
but that's not the problem
I think SLURM can decide to allocate the tasks on different nodes depending on the current CPU/node availability
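i.e. something like this (an untested sketch of your same example, just moving the cpus directive before input:)

process fastqc {

    cpus 2

    input:
    file f from myFastqs

    when:
    f.name =~ /fastq$/

    script:
    """
    fastqc --threads ${task.cpus} -f fastq $f
    """
}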
Félix C. Morency
@fmorency
Apr 12 2017 18:43
it also depends on how your slurm nodes are configured
Shawn Rynearson
@srynobio
Apr 12 2017 18:45
It can, but node sharing or --ntasks is not something offered by most HPC centers (TACC, etc.). But it sounds like this is not really a Nextflow-solvable issue. I just wanted to check if there was an option or command available that I'd missed.
Paolo Di Tommaso
@pditommaso
Apr 12 2017 18:46
if this can be done with a plain sbatch command, it can be done with NF as well.
NF creates a wrapper script for each task and submits it with sbatch, so there's no magic
I would suggest asking your sysadmin how to manage this use case
Shawn Rynearson
@srynobio
Apr 12 2017 18:51
I think I know the answer now, thanks for the help and the great tool.
Paolo Di Tommaso
@pditommaso
Apr 12 2017 18:52
nice, I'm happy you are finding it useful
note: you can replace --partition=my-partition with
process {
  executor = 'slurm'
  queue = 'my-partition'  
}
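so a possible full config would look something like this (just a sketch, keeping your account option in clusterOptions)

process {
    executor = 'slurm'
    queue = 'my-partition'
    clusterOptions = '--account=my-account'
}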
Félix C. Morency
@fmorency
Apr 12 2017 20:22
@pditommaso I read your paper and wondered if you tested the (Docker) numerical analysis against different processor brands/architectures, i.e. Intel vs AMD, x86 vs others
Paolo Di Tommaso
@pditommaso
Apr 12 2017 20:28
well, we tested Intel Core (Mac) vs Xeon (Linux)