These are chat archives for nextflow-io/nextflow

19th Jul 2018
Francesco Strozzi
@fstrozzi
Jul 19 2018 09:54 UTC
hello, I have the impression that when a run fails on AWS Batch, NF does not issue a Terminate API call but only a Cancel API call
this works fine for cancelling jobs in the RUNNABLE state, but jobs that are actually RUNNING need a Terminate call
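(For reference, the two AWS Batch calls in question; CancelJob only affects jobs that have not yet started running, while TerminateJob also kills jobs already in the RUNNING state. The job ID below is a placeholder:)

aws batch cancel-job --job-id <job-id> --reason "Nextflow run failed"
aws batch terminate-job --job-id <job-id> --reason "Nextflow run failed"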
Paolo Di Tommaso
@pditommaso
Jul 19 2018 09:55 UTC
your impression is right
nextflow-io/nextflow#782
the 0.31.0-SNAPSHOT does
Francesco Strozzi
@fstrozzi
Jul 19 2018 09:55 UTC
mmm I think this is another problem maybe

the 0.31.0-SNAPSHOT does

this is what I am using now, and I experienced this problem a few minutes ago. Now I have multiple copies of the same jobs running at the same time, because those that were part of the previously “failed” run were not terminated.

  Version: 0.31.0-SNAPSHOT build 4882
  Modified: 09-07-2018 13:33 UTC (15:33 CEST)
  System: Mac OS X 10.11.6
  Runtime: Groovy 2.4.15 on OpenJDK 64-Bit Server VM 1.8.0_121-b15
  Encoding: UTF-8 (UTF-8)
Paolo Di Tommaso
@pditommaso
Jul 19 2018 09:57 UTC
too old, it was changed two days ago ..
update it with
CAPSULE_RESET=1 NXF_VER=0.31.0-SNAPSHOT nextflow info
Francesco Strozzi
@fstrozzi
Jul 19 2018 09:58 UTC
:+1: I’ll give it a try, thanks
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:17 UTC
hello,
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:22 UTC

First, thanks for this nice workflow manager.
I'd like to know if something like this could be done in a configuration file?

process {
    executor='slurm'
    queue = 'hubbioit'
    clusterOptions='--qos=hubbioit'
    errorStrategy = 'terminate'

    $filtering {
        container = "~/filtering.simg"
        errorStrategy { task.exitStatus == 143 ? 'retry' : 'terminate' }
        maxRetries = 2
        memory { 10.GB * task.attempt }
    }
}

With this I got the following error:

ERROR ~ Unable to parse config file: '/pasteur/projets/policy01/BioIT/quentin/scripts/Benchmark_binning/nextflow_slurm_singularity_common.config' 

  No signature of method: nextflow.util.MemoryUnit.multiply() is applicable for argument types: (groovy.util.ConfigObject) values: [[:]]
  Possible solutions: multiply(java.lang.Number)
Paolo Di Tommaso
@pditommaso
Jul 19 2018 14:23 UTC
better now ! :clap:
if that's the case, the problem is memory { 10.GB * task.attempt }
it should be memory = { 10.GB * task.attempt }
also, the $filtering syntax is deprecated
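(A minimal sketch combining both fixes, assuming the same process name; withName is the non-deprecated selector:)

process {
    withName: filtering {
        // '=' assigns the closure itself, so task.attempt is resolved per attempt
        memory = { 10.GB * task.attempt }
    }
}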
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:27 UTC
ok I'll try, I thought it was either = or {}
Paolo Di Tommaso
@pditommaso
Jul 19 2018 14:27 UTC
consider using a process selector
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:28 UTC
ok, thank you
Paolo Di Tommaso
@pditommaso
Jul 19 2018 14:28 UTC
welcome
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:47 UTC

I modified the config file:

process {
    executor='slurm'
    queue = 'hubbioit'
    clusterOptions='--qos=hubbioit'
    errorStrategy = 'terminate'

    withName: filtering {
        container = "~/filtering.simg"
        errorStrategy = { task.exitStatus == 143 ? 'retry' : 'terminate' }
        maxRetries = 2
        memory = { 10.GB * task.attempt }
    }
}

I didn't get: ERROR ~ Unable to parse config file
But when the process started, I got the same error:

No signature of method: nextflow.util.MemoryUnit.multiply() is applicable for argument types: (groovy.util.ConfigObject) values: [[:]]
Possible solutions: multiply(java.lang.Number)
Paolo Di Tommaso
@pditommaso
Jul 19 2018 14:51 UTC
what version are you using ?
nextflow -version
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:51 UTC
version 0.26.3 build 4740
Paolo Di Tommaso
@pditommaso
Jul 19 2018 14:51 UTC
umm, old
update it
nextflow self-update
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:54 UTC
no improvement :/
Paolo Di Tommaso
@pditommaso
Jul 19 2018 14:55 UTC
just tried and it works
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 14:59 UTC
strange, it doesn't work for me (tried twice)
Paolo Di Tommaso
@pditommaso
Jul 19 2018 15:00 UTC
isolate the problem with a minimal script and config files and open an issue on GitHub including them
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 15:00 UTC
ok, I'll do so
thanks again
Paolo Di Tommaso
@pditommaso
Jul 19 2018 15:01 UTC
welcome
QuentinLetourneur
@QuentinLetourneur
Jul 19 2018 15:11 UTC
I made a mistake, it works for me too!
Paolo Di Tommaso
@pditommaso
Jul 19 2018 15:11 UTC
I was sure about that :wink:
Clément ZOTTI
@czotti
Jul 19 2018 18:47 UTC

Hi, I have a question regarding output files.

I use a pipeline to train a model; during training, my script saves checkpoints of the model in a directory called checkpoints.

My main.nf is:

process train_model {
    publishDir "results/"
    scratch true

    input:
    set sid, file(dataset) from training_input

    output:
    set sid, file('checkpoints') into training_output

    script:
    """
    model_train.py -bs 5 -epochs 50 --input ${dataset}
    """
}

Let's say I let my model train for two days and then want to kill it (i.e. Ctrl+C), but I also want the checkpoints generated by my script to be saved into the publishDir. Is there any directive to save my directory into the publishDir if a task is manually killed or crashes?
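(As the reply below notes, partial outputs survive in the task's work directory even when a run is interrupted, unless scratch has placed them on node-local storage; a minimal sketch of recovering them by hand, with a hypothetical task hash:)

# the two-part hash for each task is printed in the run log; 3f/2a1b9c is hypothetical
cp -r work/3f/2a1b9c*/checkpoints results/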

Mike Smoot
@mes5k
Jul 19 2018 20:36 UTC
@czotti I believe publishDir is only populated once the output channel is populated, so I don't it will do what you want. However, your intermediate results should be still be in the work directory for the process, so you should be able to see them there. It might be an interesting feature to support for outputting intermediate results into the output channel, but I'm not sure how easy that would be or whether it would break the dataflow model.