These are chat archives for nextflow-io/nextflow

3rd
Jul 2017
Simone Baffelli
@baffelli
Jul 03 2017 07:35
Morning
Does "Fixed Extend each repeater syntax to support file collections #355" mean that it is not necessary to use combine anymore if we want to operate on all combinations of some files?
can I also use multiple each in a single process?
Paolo Di Tommaso
@pditommaso
Jul 03 2017 07:47
yes, you can use multiple each in a single process, however combine may still be required if you need to compose complex objects, e.g. multiple tuples instead of values or files
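A minimal sketch of two each repeaters in one process, written in the DSL syntax of that period. The names sequences, methods, modes and align, and the data/*.fa path, are made up for illustration; the script only echoes the command it would run.

```groovy
// Hypothetical inputs: any set of files plus two plain lists of values
sequences = Channel.fromPath('data/*.fa')
methods   = ['regular', 'expresso']
modes     = ['fast', 'slow']

process align {
    input:
    file seq from sequences
    each method from methods   // first repeater
    each mode from modes       // second repeater

    // one task per (seq, method, mode) combination
    """
    echo align --method $method --mode $mode $seq
    """
}
```

With two files this should launch eight tasks (2 files x 2 methods x 2 modes), with no combine needed as long as the repeated items are plain values or files.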
Simone Baffelli
@baffelli
Jul 03 2017 07:48
Excellent! That means that each does not support collections yet?
Paolo Di Tommaso
@pditommaso
Jul 03 2017 07:49
exactly
Simone Baffelli
@baffelli
Jul 03 2017 07:49
:+1:
anyway, you did a great job with v0.25.0
:smile:
so many nice features
Paolo Di Tommaso
@pditommaso
Jul 03 2017 07:49
thanks for your suggestions ;)
Simone Baffelli
@baffelli
Jul 03 2017 07:52
Always happy to help. Nextflow made my life so much easier. Especially debugging is much easier...just need to cd into the workdir ;)
On the same note, since I'm a lazy person, I suggest a small change: would it be possible to include the full paths of the scripts in bin into .command.sh?
It would make debugging a step even easier
Paolo Di Tommaso
@pditommaso
Jul 03 2017 07:54
um, that would need to resolve all used tools and paths, too heavy
Simone Baffelli
@baffelli
Jul 03 2017 07:55
I see
Paolo Di Tommaso
@pditommaso
Jul 03 2017 07:55
however note the task environment is stored in the .command.env file
Simone Baffelli
@baffelli
Jul 03 2017 07:56
ah cool ;)
I just need to source it?
Paolo Di Tommaso
@pditommaso
Jul 03 2017 07:56
yep
@Hammarn I've uploaded version 0.25.2-SNAPSHOT, you may want to check if it solves the problem you were reporting
Rickard Hammarén
@Hammarn
Jul 03 2017 08:00
Thanks! I'll try it out
Rickard Hammarén
@Hammarn
Jul 03 2017 08:39
@pditommaso It seems to work great. The correct error is displayed and the execution is halted
Paolo Di Tommaso
@pditommaso
Jul 03 2017 08:39
cool
Phil Ewels
@ewels
Jul 03 2017 09:43
:tada: :star2:
Phil Ewels
@ewels
Jul 03 2017 10:52
Hi @pditommaso - I have an optional process at the start of my pipeline (building a ref. index). If it doesn't run, then I create a dummy channel instead so that the downstream processes run. I tried doing this with makeBismarkIndex_stderr = Channel.create() but the downstream process then just hangs. Then I tried makeBismarkIndex_stderr = Channel.empty(), but the downstream processes don't run at all now (the pipeline exits cleanly).
Any pointers on how I should be doing this?
Generating the empty channel here (here if the optional process is running), to feed into a process here
Paolo Di Tommaso
@pditommaso
Jul 03 2017 11:00
yes, if you use Channel.create() the process get_software_versions will hang waiting for data that never arrives
otherwise when using Channel.empty() there's no data to process hence get_software_versions would be skipped
in your case you need a channel emitting a dummy value, e.g. Channel.from('noversion')
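A rough sketch of that suggestion, assuming a hypothetical params.build_index switch and a placeholder index.log file; only makeBismarkIndex_stderr, get_software_versions and Channel.from('noversion') come from the conversation. The downstream input is declared as val here so that it accepts either the real file or the dummy string.

```groovy
params.build_index = false   // hypothetical switch, not from the original pipeline

if( params.build_index ) {
    process makeBismarkIndex {
        output:
        file 'index.log' into makeBismarkIndex_stderr

        // placeholder command standing in for the real index build
        """
        echo "building index" > index.log
        """
    }
}
else {
    // emits exactly one dummy value, so the downstream process still runs once
    makeBismarkIndex_stderr = Channel.from('noversion')
}

process get_software_versions {
    input:
    val index_log from makeBismarkIndex_stderr

    """
    echo "index log: $index_log"
    """
}
```

If the downstream process declares the input as a file rather than a val, a dummy string will not fit, and some extra handling is needed on the downstream side.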
Evan Floden
@evanfloden
Jul 03 2017 11:03
I’m often stuck in similar situations, wanting to create channels that may or may not be used and having the execution of downstream processes dependent on whether the channel has contents or not. Probably just need to see some examples to determine the best way.
Paolo Di Tommaso
@pditommaso
Jul 03 2017 11:06
maybe we could add an operator emitting some values only if a condition is satisfied
Paolo Di Tommaso
@pditommaso
Jul 03 2017 11:11
for example
Channel.from(some,values).onlyIf( condition )
Maxime Garcia
@MaxUlysse
Jul 03 2017 11:27
That could be interesting ;-)
Evan Floden
@evanfloden
Jul 03 2017 11:39

I think I am also struggling with the downstream of this. Basically which processes should run being dependent on channel contents. Best to illustrate with an example.

Imagine I want four modes that the workflow can operate in. Either:

  • Mode A: Align using sequences from --seqs.
    Only Align process should run.

  • Mode B: Align and score using sequences from --seqs and references from --refs.
    Both align and score processes should run.

  • Mode C: Align using custom trees using sequences from --seqs and trees from --trees.
    Both align and tree processes should run.

  • Mode D: Align using custom trees and score using sequences from --seqs and trees from --trees and references from --refs.
    All processes should run.
    All processes should run.

I’m playing around with the best way to do this. Beyond this, using empty channels limits me to a single mode per NF run. The ideal solution would allow me to have some sets of data run through one mode and others through other modes, dependent on which files are provided for a particular dataset. I need to play around a bit more to figure it out.

Phil Ewels
@ewels
Jul 03 2017 11:47
Thanks @pditommaso! I had to do Channel.from(false) and wrap the downstream bit in an if statement in the end, as it expected that channel to emit a file instead of a string. But the principle worked :+1:
The onlyIf thing wouldn't help the downstream processes which are sitting waiting for an input though, right?
I think what I'd like more is an optional process input instead of an optional channel. But not sure if / how that would work (it probably wouldn't).
I'm not too unhappy with the current setup of creating empty channels conditionally tbh
Paolo Di Tommaso
@pditommaso
Jul 03 2017 11:50
yes, in general we would like to have a better way to handle conditional process definition also along what @skptic is suggesting
Mike Smoot
@mes5k
Jul 03 2017 12:41
:+1: for onlyIf! This would clean up a few of our pipelines where we optionally create empty channels depending on input params
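onlyIf was only a proposal at this point, so the sketch below does not use it. It shows one way the existing pattern of optionally creating an empty channel from an input parameter might look; params.refs, refs and the score process are hypothetical names.

```groovy
params.refs = false   // hypothetical parameter, e.g. set with --refs 'refs/*.fa'

// a real channel when the parameter is given, an empty one otherwise
refs = params.refs ? Channel.fromPath(params.refs) : Channel.empty()

process score {
    input:
    file ref from refs

    // never executed when refs is empty: the process is skipped
    """
    echo scoring against $ref
    """
}
```

As noted earlier in the conversation, an empty channel means the downstream process is skipped and the pipeline exits cleanly.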
Félix C. Morency
@fmorency
Jul 03 2017 13:43

@pditommaso I read that optional can only be used on file. I tried adding a file such as

set sid, "foo" into bar
file "foo" optional true

but NF still complains that the foo output file is missing. How can I achieve this?

Paolo Di Tommaso
@pditommaso
Jul 03 2017 13:45
process foo {
    output:
    file "foo" optional true

    '''
    echo x > foox
    '''
}
Félix C. Morency
@fmorency
Jul 03 2017 13:45
But I need my set :(
Paolo Di Tommaso
@pditommaso
Jul 03 2017 13:46
oh, I'm sorry, it is not supported with set
what value would you give for that file if it does not exist?
Félix C. Morency
@fmorency
Jul 03 2017 13:49
just stop the pipeline there (ie. stop that branch)
Paolo Di Tommaso
@pditommaso
Jul 03 2017 13:51
do you mean not producing any output ?
Félix C. Morency
@fmorency
Jul 03 2017 13:51
yes
Paolo Di Tommaso
@pditommaso
Jul 03 2017 13:52
I see, you can do (almost) the same by setting errorStrategy 'ignore'
Félix C. Morency
@fmorency
Jul 03 2017 13:53
mm that's too invasive. I want real errors to still pop
Paolo Di Tommaso
@pditommaso
Jul 03 2017 13:53
yes, you are right
Félix C. Morency
@fmorency
Jul 03 2017 13:53
The process executing before foo is pruning some data and sometimes the pruning can produce some empty results
Paolo Di Tommaso
@pditommaso
Jul 03 2017 13:53
the only workaround for now is to touch an empty file and then filter that tuple out
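A sketch of how that workaround could look, assuming made-up sample ids s1 and s2 and a toy script in which only s1 yields data. The names samples, bar_raw and bar are illustrative; sid, foo and the set output follow the snippet posted above.

```groovy
samples = Channel.from('s1', 's2')   // hypothetical sample ids

process foo {
    input:
    val sid from samples

    output:
    set sid, file('foo') into bar_raw

    // always create foo so the output is found; only s1 gets real content
    """
    touch foo
    if [ "$sid" = "s1" ]; then echo data > foo; fi
    """
}

// drop the tuples whose file is empty, which stops that branch only
bar = bar_raw.filter { it[1].size() > 0 }

bar.subscribe { println it }
```

Real failures in foo still stop the run, unlike with errorStrategy 'ignore', because only the empty placeholder files are filtered out.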
Félix C. Morency
@fmorency
Jul 03 2017 13:53
I see.
Thanks
Paolo Di Tommaso
@pditommaso
Jul 03 2017 13:59
I will give optional for set a try
Félix C. Morency
@fmorency
Jul 03 2017 14:03
\o/
that would be awesome. want me to log an issue?
Paolo Di Tommaso
@pditommaso
Jul 03 2017 14:04
yes please
Félix C. Morency
@fmorency
Jul 03 2017 14:15
done
#399
Paolo Di Tommaso
@pditommaso
Jul 03 2017 14:15
:+1: