These are chat archives for nextflow-io/nextflow

5th
Aug 2016
Mokok
@Mokok
Aug 05 2016 08:50
Hi, I found something quite disturbing about task tracing:
Paolo Di Tommaso
@pditommaso
Aug 05 2016 08:51
?
Mokok
@Mokok
Aug 05 2016 08:56

if I launch a process with a list as input, it will run one task per list entry
and if some task fails, it cancels the execution

in this case, all the tasks appear to be submitted, but here is the strange thing:
tasks before the failure are marked as completed
one is marked as failed
maybe some are completed
and all the remaining ones are marked as aborted.

and with -resume only the failed tasks are re-run (even the tasks that should have failed but were marked as 'aborted')

Paolo Di Tommaso
@pditommaso
Aug 05 2016 08:58
not clear, please produce an example
Mokok
@Mokok
Aug 05 2016 09:01
the following script fails on values 15 and 214 because the executed command tries to remove a non-existent file:

#!/usr/bin/env nextflow

source = Channel.from([0,0,0,0,15,0,0,0,214,0,0])

process listWatcher {
        echo true

        input:
        val n from source

        output:
        val n into res

        script:
        """
        echo "n : $n"
        """
}

process exitNormalOrErrorAndRetryToCorrect {
        echo true

        input:
        val r from res

        script:
        switch ( r ) {
                case 15 :
                        exitVal = 'rm /path/to/somethingDoesNotExist.nop'
                        break
                case 214 :
                        exitVal = 'rm /path/to/somethingDoesNotExist2.nop'
                        break
                case 0 :
                        exitVal = 'echo done'
                        break
                default :
                        exitVal = 'exit 1'
                        break
        }
        """
        sleep 3
        echo "`date +%T` | exit code : $r -> $exitVal"
        eval $exitVal
        """
}
below you can see the trace file:
never mind that it's ugly, I'll give it a better look
all listWatcher tasks are completed
then there are 6 completed 'exitNormalOrErrorAndRetryToCorrect' tasks
1 failed, 1 completed, and all the others are aborted
Paolo Di Tommaso
@pditommaso
Aug 05 2016 09:03
so?
Mokok
@Mokok
Aug 05 2016 09:04
the fact is, if I 'touch' the 2 files whose absence causes the 2 errors and -resume the script
only 2 tasks are re-run, the 2 'rm [..]' ones
Paolo Di Tommaso
@pditommaso
Aug 05 2016 09:05
how do you see that they are aborted?
Mokok
@Mokok
Aug 05 2016 09:05
so nextflow must have evaluated all the tasks to know which ones failed or not... then why don't the 'aborted' tasks produce anything in their 'work' directory?
the trace file (-with-trace)
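For reference, a minimal way to reproduce such a trace (the script name is illustrative):

nextflow run watcher.nf -with-trace
# trace.txt is tab-separated; its status column reports
# COMPLETED / FAILED / ABORTED for each task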
Paolo Di Tommaso
@pditommaso
Aug 05 2016 09:08
but your exitNormalOrErrorAndRetryToCorrect is not producing any output, so why are you expecting something in the workdir?
Mokok
@Mokok
Aug 05 2016 09:08
at least the .command.out, .command.err and .exitcode files
Paolo Di Tommaso
@pditommaso
Aug 05 2016 09:09
the .exitcode file must be there
otherwise the task would be re-executed
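A quick way to check that on disk, assuming the default two-level work directory layout (a sketch, not from the chat):

for d in work/*/*/; do
    # print every task directory that lacks a .exitcode file
    [ -e "$d/.exitcode" ] || echo "$d"
done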
Mokok
@Mokok
Aug 05 2016 09:10
that's what I find strange ^^
Paolo Di Tommaso
@pditommaso
Aug 05 2016 09:11
if you are sure, open an issue including an example to replicate the problem
Mokok
@Mokok
Aug 05 2016 09:12
on -resume, nextflow only runs the tasks corresponding to the entries with values 15 and 214 (where 214 is among the aborted ones), that's what I find strange :/
I'll investigate with a clean example, to be sure. But I preferred to ask you first in case it was a known issue
Mokok
@Mokok
Aug 05 2016 11:08
ok, that's not an 'issue'; I guess it's due to asynchrony between NF and Torque, and I guess this is what happens:
NF traces only the results it has received. So, if some tasks are still running on Torque when the error causes the NF interruption, NF isn't able to log them properly. However, the tasks eventually finish and produce results. When NF is -resumed, it takes note of all the produced results (including the latecomers) and traces fewer tasks than the number of previously aborted/failed ones (this number may be Torque's queue size)
(the process runs as it should, only the trace is impacted)
/end
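One way to observe this behaviour on resume, assuming the default file names (a sketch):

nextflow run watcher.nf -resume -with-trace
# previously completed tasks are restored from the cache and reported
# as cached in the log; only the tasks that really failed run again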
Tiffany Delhomme
@tdelhomme
Aug 05 2016 13:40
This message was deleted
Hi Paolo,
I ran a nextflow pipeline with 3 processes; the first one is ok, but in the second one I have a Successfully completed. status in the log yet the process is Re-submitted; any idea what is happening?
Tiffany Delhomme
@tdelhomme
Aug 05 2016 13:45
Also I have this line in the .command.err file:
/mnt/beegfs/delhommet/needlestack_callings/annotate_GATK_vcf/myeloma/work/5c/4827d915bf4305260e8345e3917fc8/.command.run.1: line 90:  9321 Terminated              nxf_trace "$pid" .command.trace
and this pipeline was tested on a part of the same input file without any problem...
Paolo Di Tommaso
@pditommaso
Aug 05 2016 14:15
@tdelhomme may it be that the process was re-executed due to a temporary failure?
Tiffany Delhomme
@tdelhomme
Aug 05 2016 14:26
@pditommaso what is a temporary failure? does this mean the process is waiting too long?
Paolo Di Tommaso
@pditommaso
Aug 05 2016 14:28
I have too little information, but from what you are saying it seems that the first time the task failed, it was re-executed, and it completed successfully
Tiffany Delhomme
@tdelhomme
Aug 05 2016 14:33

no, actually the task is successfully completed according to its .command.log; I also checked the output, which is correct, but the task is re-executed after that

for example, I have this in the .nextflow.log of the pipeline:

Aug-05 16:30:18.686 [pool-1-thread-7] DEBUG nextflow.executor.GridTaskHandler - Launching process > annotate_vcf (97) -- work folder: /mnt/beegfs/delhommet/needlestack_callings/annotate_GATK_vcf/myeloma/$
Aug-05 16:30:18.793 [pool-1-thread-7] INFO  nextflow.processor.TaskDispatcher - [7c/7925f9] Re-submitted process > annotate_vcf (97)

and in the .command.log of the previous (97) I have successfully completed

it seems to be re-executed after a successful completion, without any error
Paolo Di Tommaso
@pditommaso
Aug 05 2016 14:36
does this happen systematically, or just once?
the work folder appears a bit strange
/mnt/beegfs/delhommet/needlestack_callings/annotate_GATK_vcf/myeloma/$
what's that ending $ ?
Tiffany Delhomme
@tdelhomme
Aug 05 2016 14:38
no, this happens for each instance of this process
yes, sorry, it was truncated by the copy-paste
Aug-05 16:30:18.686 [pool-1-thread-7] DEBUG nextflow.executor.GridTaskHandler - Launching process > annotate_vcf (97) -- work folder: /mnt/beegfs/delhommet/needlestack_callings/annotate_GATK_vcf/myeloma/work/7c/7925f92a85fe53aa63eba40294cd1e
Paolo Di Tommaso
@pditommaso
Aug 05 2016 14:44
this is the pipeline, right?
which step is reporting this problem?
Paolo Di Tommaso
@pditommaso
Aug 05 2016 14:49
not sure that's the problem, but what's this syntax?
!{!params.no_plots}
I mean it should be !{var-name}
why is there an exclamation mark after the { ?
(on more than one variable in the same line ..)
Tiffany Delhomme
@tdelhomme
Aug 05 2016 14:52
to take the opposite: if params.no_plots is true I want to send false, but I've already tested it...
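A minimal sketch of that pattern, with a made-up process name (in a shell: block the !{...} placeholders are resolved at the Groovy level, so the expression can be negated in place):

params.no_plots = false

process plotFlag {
    echo true

    shell:
    '''
    echo "produce plots: !{!params.no_plots}"
    '''
}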
Paolo Di Tommaso
@pditommaso
Aug 05 2016 14:53
ah
:)
Tiffany Delhomme
@tdelhomme
Aug 05 2016 14:53
phew ^^
Paolo Di Tommaso
@pditommaso
Aug 05 2016 14:55
Is the .exitcode file in the folder?
/mnt/beegfs/delhommet/needlestack_callings/annotate_GATK_vcf/myeloma/work/7c/7925f92a85fe53aa63eba40294cd1e
Tiffany Delhomme
@tdelhomme
Aug 05 2016 14:56
nop
but this is the folder of the re-submitted one, I will check the folder of the first one
it contains 0
Paolo Di Tommaso
@pditommaso
Aug 05 2016 15:01
if that file does not exist, it's very likely that the cluster has killed it
Tiffany Delhomme
@tdelhomme
Aug 05 2016 15:03
Does it have to exist in the folder of the re-submitted process??
I have to leave... I think I will kill the remaining processes, update the nextflow version and relaunch over the next 2 days with more splitting, to deal with smaller files; what do you think about it?
Mokok
@Mokok
Aug 05 2016 15:03
This message was deleted
Paolo Di Tommaso
@pditommaso
Aug 05 2016 15:04
in the trace file there is the job ID given by LSF
you can use it to understand how LSF terminated that job
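For example, assuming the job ID taken from the trace is 123456 (an illustrative value):

bhist -l 123456   # full LSF job history, including how the job ended
bacct -l 123456   # accounting record with the exit code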
Tiffany Delhomme
@tdelhomme
Aug 05 2016 15:07
Ok, I will check that then, thanks again
Paolo Di Tommaso
@pditommaso
Aug 05 2016 15:07
welcome !
how is the temperature in Lyon?
:)
Tiffany Delhomme
@tdelhomme
Aug 05 2016 15:10
not so good :worried: and for you? Spain, not so bad I guess :smile:
Paolo Di Tommaso
@pditommaso
Aug 05 2016 15:11
enjoy the weekend
Tiffany Delhomme
@tdelhomme
Aug 05 2016 15:14
lucky guy :+1:
same for you, and enjoy the Olympics!
Mike Smoot
@mes5k
Aug 05 2016 18:42
Hi Paolo, is there a variable that holds the work dir for a given task? What I'd like to be able to do is something like:
output:
file("results") into res_channel

script:
"""
export NECESSARY_ENV_VAR=${workdir}/results
dumb_command_that_needs_env_var > results
"""
Paolo Di Tommaso
@pditommaso
Aug 05 2016 18:44
um the task work dir or the pipeline work dir?
Mike Smoot
@mes5k
Aug 05 2016 19:21
Task work dir
Paolo Di Tommaso
@pditommaso
Aug 05 2016 19:22
um, what's wrong with $PWD ?
make sure to escape it, thus \$PWD
Mike Smoot
@mes5k
Aug 05 2016 19:23
That's what I tried first and it gave me the dir containing main.nf
Maybe I needed to escape it
Paolo Di Tommaso
@pditommaso
Aug 05 2016 19:24
yes!
otherwise it is resolved at the Groovy level
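A minimal sketch combining the two answers, reusing the names from Mike's snippet:

process useTaskDir {
    output:
    file("results") into res_channel

    script:
    """
    # \$PWD is escaped, so bash expands it inside the task work dir
    export NECESSARY_ENV_VAR=\$PWD/results
    dumb_command_that_needs_env_var > results
    """
}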
Mike Smoot
@mes5k
Aug 05 2016 19:25
Ok makes sense - many thanks!
Paolo Di Tommaso
@pditommaso
Aug 05 2016 19:25
welcome
Mike Smoot
@mes5k
Aug 05 2016 22:14

@pditommaso Hi Paolo, just following up about my comment on this issue: nextflow-io/nextflow#208 from last week. Here is a simple example where one process writes a YAML file that I then parse in a channel. My use case involved some old Python code that generated a Python object that I needed to use in a channel. This solution works pretty well for me and seems like it might be a simple pattern for passing complicated objects around Nextflow. Anyway, here is a simplified example:

@Grab(group='org.jyaml', module='jyaml', version='1.3')
import org.ho.yaml.Yaml

// There's just one input to the pipeline.
ch = Channel.from(10)

process dumpPythonObjectAsYAMLFile {

    input:
    val x from ch

    output:
    file("res.yaml") into yaml_result

    // This script takes the single input
    // and produces a single object that
    // contains a bunch of results.
    script:
    """
    #!/usr/bin/env python
    import yaml
    result = []
    for xx in range(1,${x}):
        # This object is simple, but could
        # be much more complicated.
        obj = {}
        obj['square'] = xx * xx
        obj['double'] = 2 * xx
        obj['triple'] = 3 * xx
        obj['orig'] = xx
        result.append(obj)

    with open('res.yaml','w') as out:
        out.write(yaml.dump(result))
    """
}

def yaml_to_list(yaml_file) {
    def l = []
    if (yaml_file != null) {
        try {
            l = Yaml.load(yaml_file.text)
        } catch (Exception e) {
            log.warn "${yaml_file} ${e.toString()}"
            l = []
        }
    }
    return l
}


yaml_result
    // I'm wondering whether it makes sense to create an operator that
    // generalizes this call such that arbitrary YAML (or JSON) could
    // be parsed into a channel. This would allow more complicated data
    // structures to be passed around Nextflow.
    .flatMap{ yaml_to_list(it) }
    .map{ x -> [x.square, x.double, x.triple, x.orig] }
    .set{ individualResults }



process handleIndividualResults {
    input:
    set sq, db, trp, orig from individualResults

    output:
    stdout into finalResults

    script:
    """
    #!/usr/bin/env python
    if ${sq} == ${db}:
        print "HOORAY for ${orig}!"
    elif ${sq} == ${trp}:
        print "BOOO for ${orig}!"
    else:
        print "OK for ${orig}"
    """
}

finalResults.view()

I'm curious what you think or if there are other preferred approaches for this sort of thing.
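For what it's worth, the expected output can be worked out by hand (ordering is nondeterministic since tasks run in parallel): with the single input 10, the Python loop yields xx = 1..9; the square equals the double only for xx = 2 (4 == 4) and equals the triple only for xx = 3 (9 == 9), so finalResults.view() should print, in some order:

HOORAY for 2!
BOOO for 3!
OK for 1    (and likewise for 4, 5, 6, 7, 8, 9)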