These are chat archives for nextflow-io/nextflow

21st
Dec 2018
Stephen Ficklin
@spficklin
Dec 21 2018 04:05
Another question. My workflow downloads 26,000 data files. I have cleanup working such that cached files are removed when they are no longer needed (and resume still works!). I allow the queue size on my cluster to be 100. What I would like to happen is for all 100 jobs to run through the workflow to completion before the next 100 start. This way I don't overrun my storage. Is this possible to do?
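For context, the 100-job cap mentioned above is typically set with the executor queueSize option in nextflow.config; a minimal sketch (it limits how many jobs are submitted/queued at once, it does not by itself force each batch of 100 to finish before the next starts):

executor {
    // cap on how many jobs Nextflow keeps queued on the cluster at any one time
    queueSize = 100
}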
micans
@micans
Dec 21 2018 10:03
Different topic: total read count: -1868244849 ... I need to find a better Groovy int. Bigint? hold on ...
micans
@micans
Dec 21 2018 10:18
Well, toBigInteger sounds about right ...
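A minimal Groovy sketch of the fix being discussed, assuming the negative total comes from summing read counts in a 32-bit int:

// two large per-file read counts; summing them as int wraps around to a negative value
long a = 1500000000
long b = 1500000000
def asInt = (int) (a + b)                          // overflows, becomes negative
def asBig = a.toBigInteger() + b.toBigInteger()    // 3000000000, no overflow
println "as int: ${asInt}   as BigInteger: ${asBig}"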
Riccardo Giannico
@giannicorik_twitter
Dec 21 2018 10:52
Hi @pditommaso , I found this strange behaviour changing from v0.30.0 to v18.10.1 (I have them both, in two different conda envs).
Launching this sbatch -n 1 nextflow run myflow.nf on the same .nf, from the same user, same folder, same $HOME/.nextflow/config containing process.executor = 'slurm':
on conda nextflow_v0.30.0 I get "[warm up] executor > slurm"
while on conda nextflow_v18.10.1 I oddly get "[warm up] executor > local"
How is that possible? Am I missing something? Isn't Nextflow v18.10.1 reading the $HOME/.nextflow/config file?
Paolo Di Tommaso
@pditommaso
Dec 21 2018 12:25
Check your config with the nextflow config command
deo999
@deo999
Dec 21 2018 13:00
hi
have you tested Nextflow on Docker Swarm and Kubernetes?
I mean, have you provided benchmarks for Nextflow on Docker and Singularity?
Riccardo Giannico
@giannicorik_twitter
Dec 21 2018 16:37

@pditommaso
from conda nextflow_v0.30.0

$ nextflow config
process {
   executor = 'slurm'
}

from conda nextflow_v18.10.1 I have nothing

$ nextflow config
$
Riccardo Giannico
@giannicorik_twitter
Dec 21 2018 16:59

and of course from both conda envs:

$ cat $HOME/.nextflow/config
process.executor = 'slurm'

I also tried deleting the env nextflow_v18.10.1 and creating it again, but the issue is still happening

Stephen Kelly
@stevekm
Dec 21 2018 19:08

@spficklin

I'm trying to work on a cleanup process that removes unwanted files.

Not sure exactly what the context of the cleanup is, but I accomplish this with some Makefile wrappers here: https://github.com/NYU-Molecular-Pathology/NGS580-nf/blob/08db26c0cd8afe32e08363adf282244733d10b77/Makefile#L390

Basically, I resolve all symlinks in the 'publishDir', then I remove all extraneous 'work' subdirs (from previous runs of the pipeline), then I save a list of all files that were in each work subdir and then attempt to replace all files with empty file-stubs.

I think some of this might already be integrated into Nextflow, though I've never actually explored it. I just wanted to remove all the major space-consuming items while keeping a record of what was produced. Note, however, that this does create problems with the 'resume' functionality, because it can sometimes think that old tasks are resumable when in fact the files have been destroyed (a stub is still there, so it looks like the files still exist).

So if I try to re-run a pipeline with '-resume' but forgot that I 'finalized' it as per above, it will see the empty fastq files from my first pipeline step and try to run Trimmomatic on them, resulting in errors.
Stephen Ficklin
@spficklin
Dec 21 2018 19:10
Hi @stevekm . I took a very similar approach. I pushed all files that I wanted to clean up into a channel that I convert to a value so that I get the full path. Then in a cleanup process I convert the file to a sparse file (it's actually size 0 but reports the original size) and set the modify/access times to be the same as before. I can resume just fine, but I too have the problem that if I ever do want to repeat a step, I have to be cognizant that I removed the files.
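A rough Groovy sketch of the stub idea both approaches describe (a hypothetical helper, not the actual pipeline code): drop the data, re-extend the file so it still reports its old size, and restore the modification time so -resume still matches:

// hypothetical helper illustrating the sparse-file/stub approach described above
def stubOut = { File f ->
    long size  = f.length()
    long mtime = f.lastModified()
    new RandomAccessFile(f, 'rw').withCloseable { raf ->
        raf.setLength(0)        // throw away the actual data
        raf.setLength(size)     // re-extend so the file reports its original size (sparse on most filesystems)
    }
    f.setLastModified(mtime)    // keep the old mtime so cached tasks still look valid to -resume
}

stubOut(new File('work/xx/some_task_dir/sample1.bam'))   // example path, made up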
Stephen Kelly
@stevekm
Dec 21 2018 19:13
@giannicorik_twitter I came up with a fun little SLURM 'auto queue picker' to choose the best SLURM queue to run on, and then pass that as a parameter to the NF script to configure the process; example here:
https://github.com/NYU-Molecular-Pathology/NGS580-nf/blob/08db26c0cd8afe32e08363adf282244733d10b77/Makefile#L234
it first tries to pick the queue with the most idle nodes, then falls back to the queue with the most mixed nodes. Maybe it's convenient for you; it helps me with load balancing on our system so I don't flood any one partition too much :)
@spficklin that is interesting, I will have to consider it. I've got a 'clean-all' step I use to basically nuke all the Nextflow output if I want to start over from scratch, though maybe I should have one that preserves the publishDir instead..
Stephen Ficklin
@spficklin
Dec 21 2018 19:18
@stevekm the approach I take cleans up the work directory, and we have flags to turn deletion on/off in the config file. So if someone turns on automatic deletion, the file in the work directory gets removed and the file is not added to the publishDir either, so both are taken care of at once.
we change the publishDir pattern depending on how that flag is set.
So, cleanup happens while the workflow is running.
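A hypothetical sketch of that flag-driven publishDir pattern (process and file names made up, not the actual workflow):

// hypothetical: a params flag that controls how much lands in the publishDir
params.del_intermediate = true

some_ch = Channel.from('sampleA', 'sampleB')

process some_step {
    // when automatic deletion is on, publish only the small final outputs,
    // not the large intermediates that will be cleaned from the work dir
    publishDir 'results/some_step',
        pattern: params.del_intermediate ? '*.txt' : '*'

    input:
    val sample from some_ch

    output:
    file '*' into some_out_ch

    script:
    """
    echo ${sample} > ${sample}.txt
    touch ${sample}.intermediate.bin
    """
}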
Stephen Kelly
@stevekm
Dec 21 2018 19:22

@pditommaso in my nextflow.config I am trying to set the process memory amount to be defined dynamically based on the number of CPUs, e.g. 8 GB * #CPUs, with a default of 2 CPUs, and some processes will have more. I was hoping that using the dynamic directives I could just set the mem amount once and have it auto-scale for each process.

process {
    cpus = 2
    mem = { 8.GB * process.cpus }

    withName: sambamba {
                cpus = 8
            }
}

^ so with this, all processes would default to 8 GB memory & 2 CPUs, but the sambamba process would end up with 64 GB memory and 8 CPUs. Does this work? I cannot tell, because my 'mem' directive is not showing up in the '#SBATCH' lines at the top of my .command.run scripts with the SLURM executor; I was also wondering why that is.

I saw mentioned in the archives here: https://gitter.im/nextflow-io/nextflow/archives/2016/11/15
that closure-based directives in the nextflow.config have deferred evaluation, but I was not sure how to get this to work in this case
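For reference, the directive Nextflow recognises is memory (not mem), and inside a config closure the per-task CPU count is task.cpus; a sketch of the intended config along those lines:

process {
    cpus = 2
    // 'memory' is the directive name; the closure is evaluated per task,
    // so task.cpus picks up whatever cpus value each process ends up with
    memory = { 8.GB * task.cpus }

    withName: sambamba {
        cpus = 8        // this process would then get 8 CPUs and 64 GB
    }
}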
Stephen Kelly
@stevekm
Dec 21 2018 19:32

yeah when I use this process config:

process.executor = 'slurm'
process.queue = "${params.queue}"
process.clusterOptions = '--ntasks-per-node=1 --export=NONE --export=NTHREADS --mem-bind=local --nice=10'

process {
            errorStrategy = "retry" 
            maxRetries = 1 
            cpus = 2 
            time = '4h'
            mem = { 8.GB * process.cpus }
            scratch = true
}

I get .command.run SBATCH scripts that look like this:

#!/bin/bash
#SBATCH -D /gpfs/data/molecpathlab/production/NGS580/170602_NB501073_0012_AHCKYCBGX2_test/work/ab/1ac0982100d7de139bac25efd88e1b
#SBATCH -J nf-fastq_merge_(SampleID)
#SBATCH -o /gpfs/data/molecpathlab/production/NGS580/170602_NB501073_0012_AHCKYCBGX2_test/work/ab/1ac0982100d7de139bac25efd88e1b/.command.log
#SBATCH --no-requeue
#SBATCH -c 2
#SBATCH -t 04:00:00
#SBATCH -p gpu4_medium
#SBATCH --ntasks-per-node=1 --export=NONE --export=NTHREADS --mem-bind=local --nice=10

^ There is no mem argument listed in here. Any idea why it's not showing up??

it's the same if I use mem = 8.GB or mem = { 8.GB * task.cpus } as well
Stephen Kelly
@stevekm
Dec 21 2018 20:02

looks like what I really want is the --mem-per-cpu arg here:

https://slurm.schedmd.com/sbatch.html

but it's more awkward to set dynamically without using the Nextflow process configs
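One possible sketch of that via clusterOptions in nextflow.config (assuming SLURM's --mem-per-cpu, which already scales with the -c value, so no closure is needed):

process {
    cpus = 2
    // --mem-per-cpu is multiplied by the -c value SLURM gets from the cpus directive
    clusterOptions = '--ntasks-per-node=1 --export=NONE --mem-bind=local --nice=10 --mem-per-cpu=8G'

    withName: sambamba {
        cpus = 8    // 8 CPUs and, via --mem-per-cpu, 64G in total
    }
}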