Stijn van Dongen
@micans
@marchoeppner I don't see anything more elegant; to solve the empty channels I can only think of using mix() combined with groupTuple(), e.g.
a = Channel.from(['a', 1], ['b', 2])
b = Channel.from(['a', 3], ['b', 4])
c = Channel.empty()

a.mix(b).mix(c).groupTuple().view()
marchoeppner
@marchoeppner
ok thanks, I will try that!
Stijn van Dongen
@micans
@marchoeppner note that groupTuple() will block until the channel has completed. In your case, you may know the size of each eventual tuple (as it is the number of analyses), so you could give it the size parameter, in that case a tuple is released once it has that size.
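For illustration, here is the same toy example from above with the size option added (the values are made up):

```
a = Channel.from(['a', 1], ['b', 2])
b = Channel.from(['a', 3], ['b', 4])

// with size: 2, a tuple is emitted as soon as two values have been
// collected for its key, instead of waiting for the channel to complete
a.mix(b).groupTuple(size: 2).view()
```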
Stijn van Dongen
@micans
@marchoeppner further caveat; the tuples you get can have the analyses in different orders. If the elements are files I imagine it does not matter much.
marchoeppner
@marchoeppner
that might actually be a problem, since I need to do something like:
input: set val(sample_id),file(analysis1),file(analysis2),file(analysis3) from Foo
so it still is a bit tricky I reckon..
it's not my pipeline, so I don't have too much control over the basic logic of it all, just trying to implement support for multiple input data sets (right now it assumes that there is only one sample)
Stijn van Dongen
@micans

How is this

input: set val(sample_id),file(analysis1),file(analysis2),file(analysis3) from Foo

going to work if some analysis could be missing? As for the order that could be fixed I think with an additional sorting step.
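One possible sketch of that sorting step, assuming the grouped values are comparable, is the sort option of groupTuple():

```
// sort: true orders the values within each emitted tuple;
// a closure can be passed instead for a custom ordering
a.mix(b).mix(c).groupTuple(sort: true).view()
```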

marchoeppner
@marchoeppner
indeed ^^
it used to be multiple arguments under "input:" with an added ".ifEmpty('')" - but that seems difficult to do now
well the empty channel needs to emit something at least, then one could try to verify each element to see if it is an actual file or just a placeholder, like "''" or NULL
Stijn van Dongen
@micans
yes, that would be a way of structuring the program. It would not be an empty channel, it would emit dummy values.
marchoeppner
@marchoeppner
problem is that this will require much more substantial changes, since we want to join/mix based on a key - so it would have to be [ some_key, NULL ] or something along those lines :D I think the whole pipeline needs to be set up in a different way.... maybe have the reporting step be like MultiQC so that it automatically detects which outputs are present or whatever
and just skip the need for dummy values...
Stijn van Dongen
@micans
I've experimented a bit .... with a setup like the following, you could perhaps stick some intelligence in the script section to detect what it has?
a = Channel.from(['a', 1], ['b', 2])
b = Channel.from(['a', 3], ['b', 4])
c = Channel.from(['a', 5], ['b', 6])
d = Channel.empty()

a.mix(b).mix(c).mix(d).groupTuple().view().set { ch }

process bar {
    // directives such as echo must come before the input block
    echo true

    input:
    set val(a), val(b) from ch

    shell:
    '''
    echo "one value !{a} other values !{b}"
    '''
}
Stijn van Dongen
@micans
If the input to the script are files, it could consider file suffixes/infixes. Or the pipeline could notify the script via a different channel.
Stijn van Dongen
@micans
(edit to poke @marchoeppner)
Combiz
any ideas why a function from a package in R would not be found when run in singularity via nextflow? e.g. could not find function "sumCountsAcrossCells"
and even calling the function with scater::sumCountsAcrossCells gives "Error: 'sumCountsAcrossCells' is not an exported object from 'namespace:scater'"
Combiz
ah nevermind, realised it's a singularity image troubleshooting issue, thanks
David Mas-Ponte
@davidmasp
Hi, stupid Groovy question, sorry if it does not make sense.
I am trying to use a parameter from params inside a .map call. Is that possible?

This is what I want to do

bed_split_ch = bed_ch
.map{it -> tuple(it[0],it[1].splitText(by: 2, file: true))}

and it works. I would like to use params.splitBed instead of 2.
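Assuming params.splitBed is defined (e.g. passed as --splitBed on the command line), the substitution would look like:

```
// same map call as above, with the literal 2 replaced by the parameter
bed_split_ch = bed_ch
    .map { it -> tuple(it[0], it[1].splitText(by: params.splitBed, file: true)) }
```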

David Mas-Ponte
@davidmasp
okay, I am stupid. It does work as expected... I was running the console and it does not reset the params between executions (could it be?), sorry for bothering
Paolo Di Tommaso
@pditommaso
yes, the console can have some tricky behavior because it resumes the same session; there should be a command to clear it in the menu somewhere ..
mmatthews06
@mmatthews06
@davidmasp , you're talking about the online Groovy console, or something else?
David Mas-Ponte
@davidmasp
mmatthews06
@mmatthews06
:thumbsup:
Matthieu Pichaud
I am running nextflow on AWS using BATCH.
The spun-up instance remains when the workflow is complete.
Is this the expected behavior?
Paolo Di Tommaso
@pditommaso
Instances should be torn down after a while
Stephen Kelly
@stevekm
hey I also posted this in the Google groups, but I am trying to come up with methods to implement unit testing and CI for Nextflow pipelines. I got as far as doing unittest in Python, but I am not sure what exactly I should be testing. Any suggestions? My work so far is here: https://github.com/stevekm/nextflow-ci
also helpful I guess in that I also started a super basic module to run Nextflow from Python; it's in there under nextflow.py, mostly just a CLI wrapper with ENV variables thrown in
Luca Cozzuto
@lucacozzuto
Hi all, I have a process that should be performed only when a parameter has a certain value, so I use the "when" condition. The problem I have is that the script complains about the absence of the input channel in that process even if the condition is not met... If I make an empty channel, then it hangs without doing anything... is this the normal behaviour?
Stijn van Dongen
@micans
@lucacozzuto I use when a lot exactly as you describe, and it works for me. Can you make an example illustrating the point?
Luca Cozzuto
@lucacozzuto
do you use Channel.create() for the empty channel? I think I found the solution using Channel.empty()
Stijn van Dongen
@micans
There are many different ways; I don't need create() as the source channel is always active, and when is only used in a downstream process. If you want a source channel to be active/inactive depending on a parameter, then I think empty() makes sense indeed.
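A minimal sketch of that pattern, with made-up names (params.runFoo and input_ch are hypothetical):

```
// the source channel is empty when the parameter is not set,
// so the process never receives any input and is simply skipped
input_ch = params.runFoo ? Channel.from(1, 2, 3) : Channel.empty()

process foo {
    when:
    params.runFoo

    input:
    val x from input_ch

    script:
    """
    echo $x
    """
}
```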
Luca Cozzuto
@lucacozzuto
ok! :)
Ólavur Mortensen
@olavurmortensen
How can I convert a channel object into a normal Groovy object? For example, I have a process that emits a directory, and I need this directory as a string to do some regex on the filenames in the folder. See example below.
[Process that emits a folder to "process1_ch"]

// Get the path in process1_ch
path = ?????????

// Do something with the path
new_ch = Channel.fromPath(path + '*somefilepatternmatching')
.map { do something with the data }

[Process that uses "new_ch"]
I run into these kinds of issues all the time, and end up using some workaround. Surely there's some way to do this.
you mean you want to use that channel inside a process ? @olavurmortensen
Ólavur Mortensen
@olavurmortensen
Do you mean process1_ch or new_ch?
oh sorry, you want to process the channel using groovy code ?
new_ch
Ólavur Mortensen
@olavurmortensen
yeah exactly
could you please share what you are trying to achieve with Groovy code? maybe there is a Nextflow function that is already there
Ólavur Mortensen
@olavurmortensen
To me, an obvious solution seems to be toList (https://www.nextflow.io/docs/latest/operator.html#tolist), but that doesn't seem to do the job
since nextflow is based on groovy code
Ólavur Mortensen
@olavurmortensen
Here's the actual code:
// Get the sample names from the files in the output FASTQ folder by regex.
sample_names_ch = Channel.fromPath(bcl2fastq_outdir + '/*.fastq.gz')
.map { it.toString() }
.map { it.replaceAll(/\/.*\//, "") }  // Remove everything leading up to the sample name.
.map { it.replaceAll(/_.*/, "") }  // Remove everything after the sample name.
.unique()
Where bcl2fastq_outdir is a directory