These are chat archives for nextflow-io/nextflow

18th
Apr 2017
Tim Diels
@timdiels
Apr 18 2017 00:13
Note I could probably work around it with cancel() { ssid cleanup.sh }
Tim Diels
@timdiels
Apr 18 2017 00:26
*setsid
Paolo Di Tommaso
@pditommaso
Apr 18 2017 09:10
NF kills processes with a SIGTERM, you are trapping SIGINT
Tim Diels
@timdiels
Apr 18 2017 09:36
Managed to trigger the issue with SIGTERM: main.nf: https://pastebin.com/u42t8PW4 Output: https://pastebin.com/kRGyuQtT
Paolo Di Tommaso
@pditommaso
Apr 18 2017 09:37
not sure I understand what the problem is
if this is an issue, please open an issue on GitHub
describing your use case, what you are expecting and what's the problem
Tim Diels
@timdiels
Apr 18 2017 09:39
The sleep was still running when nextflow exited, and the handler did not wait for it to complete. But seeing as "hi" appears twice, maybe the trap handler got interrupted by a second SIGTERM
Tim Diels
@timdiels
Apr 18 2017 09:44
I'll open an issue
Paolo Di Tommaso
@pditommaso
Apr 18 2017 14:08
@fmorency @dctrud Quick question: do you know how to reserve local storage in a SLURM sbatch job?
Félix C. Morency
@fmorency
Apr 18 2017 15:24
@pditommaso nope sorry
Tim Diels
@timdiels
Apr 18 2017 15:41
How would you avoid typing .merge(channels) {x1,x2,x3,... -> [x1,x2,x3,...]}? I tried .merge(channels) {Object... args -> args} but this results in ERROR ~ The operator's body accepts 1 parameters while it is given 3 input streams. The numbers must match.
The goal is to make a merge that returns a list without needing a closure arg
Paolo Di Tommaso
@pditommaso
Apr 18 2017 15:47
you can't
Tim Diels
@timdiels
Apr 18 2017 16:18
I guess that's why it isn't optional
Paolo Di Tommaso
@pditommaso
Apr 18 2017 16:18
I guess the same :)
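The error message quoted above suggests the operator compares the number of parameters the closure declares with the number of input streams. A varargs closure declares a single parameter, so it cannot match three streams. The following Python sketch is only an analogue of that check: `merge` and `declared_arity` are hypothetical helpers, not Nextflow's implementation.

```python
import inspect


def declared_arity(fn):
    """Count declared parameters; a *args parameter counts as one."""
    return len(inspect.signature(fn).parameters)


def merge(streams, body):
    """Toy merge: zip the streams and apply `body` to each tuple of items."""
    n_params = declared_arity(body)
    if n_params != len(streams):
        raise ValueError(
            f"The operator's body accepts {n_params} parameters while it is "
            f"given {len(streams)} input streams. The numbers must match."
        )
    return [body(*items) for items in zip(*streams)]


if __name__ == "__main__":
    streams = [[1, 2], ["a", "b"], [True, False]]
    # One declared parameter per stream: accepted.
    print(merge(streams, lambda x1, x2, x3: [x1, x2, x3]))
    # A varargs body declares a single parameter: rejected.
    try:
        merge(streams, lambda *args: list(args))
    except ValueError as err:
        print(err)
```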
Tim Diels
@timdiels
Apr 18 2017 17:42
My cache gets invalidated too often. Here are 3 runs https://pastebin.com/7Y9Jc11P
In the second run I added a tap, so the channel is unaffected, yet allVsAllProteomes is invalidated. When I run again without touching anything, it does cache as expected. Here's the code at the line where I added a tap https://gitlab.psb.ugent.be/deep_genome/pipeline/blob/nextflow/main.nf#L229
I also print the input; in the first 2 runs (and probably the third) the input to allVsAllProteomes is always the same, so why does it rerun?
Félix C. Morency
@fmorency
Apr 18 2017 17:47
@timdiels did you try to dump the hashes and see where they differ?
Tim Diels
@timdiels
Apr 18 2017 17:50
@fmorency How do I dump them? Normally the database file is unchanged (same work directory)
Félix C. Morency
@fmorency
Apr 18 2017 17:51
@timdiels nextflow run -dump-hashes ...
Tim Diels
@timdiels
Apr 18 2017 18:32
@fmorency I see, the root directory of my pipeline is part of the process' hash. Why would that be there? Is it because of the $workflow.projectDir reference in the script?
Félix C. Morency
@fmorency
Apr 18 2017 18:38
Why not have it? :)
Tim Diels
@timdiels
Apr 18 2017 18:49

@fmorency hehe, it's way too broad as it contains my main.nf. So each time my main.nf changes, bam, the cache gets invalidated. It's good to depend on $workflow.projectDir/resources/params, but not workflow.projectDir itself. I found a way around it though

params = "$workflow.projectDir/resources/params"
process allVsAll {
    """
    ... $params ...
    """
}

Setting that intermediate variable outside script instead of inlining it does the trick.
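One possible reading of this behaviour, sketched in Python: if a task's hash folds in a fingerprint of every path object the script references, then referencing the whole project directory makes the hash sensitive to main.nf, while a plain string contributes only its text. Everything here (`fingerprint_dir`, `task_hash`, the file names) is hypothetical and not Nextflow's actual hashing code. Note that in this toy version the plain string also stops tracking changes inside resources/params.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint_dir(directory):
    """Digest of every file (relative path and bytes) under `directory`."""
    digest = hashlib.sha256()
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            digest.update(str(path.relative_to(directory)).encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


def task_hash(script, *referenced):
    """Toy task key: script text plus each referenced value.

    Path objects contribute a fingerprint of what they point to;
    plain strings contribute only their text.
    """
    digest = hashlib.sha256(script.encode())
    for value in referenced:
        if isinstance(value, Path):
            digest.update(fingerprint_dir(value).encode())
        else:
            digest.update(str(value).encode())
    return digest.hexdigest()


def demo():
    """Return (broad hash changed, narrow hash unchanged) after editing main.nf."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "resources" / "params").mkdir(parents=True)
        (project / "resources" / "params" / "settings.txt").write_text("k=1\n")
        main_nf = project / "main.nf"
        main_nf.write_text("// version 1\n")

        params = f"{project}/resources/params"  # plain string, as in the workaround
        broad_before = task_hash("... $workflow.projectDir ...", project)
        narrow_before = task_hash("... $params ...", params)

        main_nf.write_text("// version 2\n")  # edit unrelated to the process

        broad_after = task_hash("... $workflow.projectDir ...", project)
        narrow_after = task_hash("... $params ...", params)
        return broad_before != broad_after, narrow_before == narrow_after


if __name__ == "__main__":
    print(demo())
```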

Félix C. Morency
@fmorency
Apr 18 2017 18:50
Well, it makes sense that each time the main.nf changes, the cache gets invalidated. Otherwise, it can lead to an undefined state.
In other words, how do you make sure the data produced by your workflow will be the same after your changes?
Michael L Heuer
@heuermh
Apr 18 2017 19:39
Along the same lines, it might be preferable to always run from a git repository so that changes to main.nf are tracked.
Tim Diels
@timdiels
Apr 18 2017 19:48
@fmorency Doesn't a resume rerun everything except processes, where processes are run like process.runUnlessCached(input) (pseudocode)? If the input to a process is the same and the code of the process hasn't changed, surely you can expect it to yield the same output; if not, it should set cache: false
In the framework I previously used, I had a version field to indicate a 'process' changed and should be rerun
Félix C. Morency
@fmorency
Apr 18 2017 19:57
Don't forget the parameters. But yes.
Tim Diels
@timdiels
Apr 18 2017 20:01
Ok, perfect, that's the behaviour I want :)
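The runUnlessCached pseudocode from this exchange can be sketched in Python as a cache keyed on the process code, its input and its parameters. This is a toy model under those assumptions, not how Nextflow implements resume; the `Process` class, its `cache` flag and the `store` dict are hypothetical names.

```python
import hashlib
import json


class Process:
    """Toy process whose result is reused when code, input and params match."""

    def __init__(self, code, func, cache=True):
        self.code = code    # text standing in for the process script
        self.func = func    # callable that does the actual work
        self.cache = cache  # False forces a rerun every time
        self.runs = 0       # how many times the work was really executed

    def key(self, value, params):
        """Hash of everything the result is assumed to depend on."""
        payload = json.dumps([self.code, value, params], sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def run_unless_cached(self, store, value, params):
        """Return the stored result for this key, or compute and store it."""
        k = self.key(value, params)
        if self.cache and k in store:
            return store[k]
        self.runs += 1
        result = self.func(value, params)
        store[k] = result
        return result


if __name__ == "__main__":
    store = {}
    scale = Process("x * factor", lambda x, params: x * params["factor"])
    scale.run_unless_cached(store, 3, {"factor": 2})  # executed
    scale.run_unless_cached(store, 3, {"factor": 2})  # served from the cache
    scale.run_unless_cached(store, 3, {"factor": 5})  # new params: executed
    print(scale.runs)
```

Changing the code text, the input or the parameters yields a new key, so only the affected process reruns.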