These are chat archives for nextflow-io/nextflow

20th
Jun 2017
LukeGoodsell
@LukeGoodsell
Jun 20 2017 10:39
Hi. Is there a way to access an input file when using an exec block? I’ve seen #22; is there a way to find out the directory that the input file came from? Here’s a quick demo script: https://gist.github.com/LukeGoodsell/7db6765adbe7309f4570e5f24158c0f5
Paolo Di Tommaso
@pditommaso
Jun 20 2017 10:46
fooFile.exists() does not work?
LukeGoodsell
@LukeGoodsell
Jun 20 2017 10:46
It does, but it’s false
the .toFile() makes no difference
Paolo Di Tommaso
@pditommaso
Jun 20 2017 10:47
Need to check
LukeGoodsell
@LukeGoodsell
Jun 20 2017 10:49
(tested with NXF_VER 0.24.4 and 0.25.0-RC4)
LukeGoodsell
@LukeGoodsell
Jun 20 2017 11:00
OK, if I declare the input file as a val instead of a file, it has the absolute path and is accessible
It’s an unintuitive quirk, but I’m happy I have a solution
Paolo Di Tommaso
@pditommaso
Jun 20 2017 11:06
Yes, the bottom line is that with exec you don't need to stage the input files in the local work dir, thus file is not needed
Though I agree that it is counterintuitive, could you please open an issue with that example?
LukeGoodsell
@LukeGoodsell
Jun 20 2017 11:35
Sure: #378
Paolo Di Tommaso
@pditommaso
Jun 20 2017 11:53
Great, thanks
Simone Baffelli
@baffelli
Jun 20 2017 11:58
Hello
Quick question cause I'm a lazy guy today: what's the best approach to recursively process a file?
Paolo Di Tommaso
@pditommaso
Jun 20 2017 11:59
Um, it depends on what you mean
The file content?
Simone Baffelli
@baffelli
Jun 20 2017 11:59
I need to repeatedly perform an operation that produces the same file taking the same file as input
and adding stuff to it
but the file is not a textfile
so I can't just use collectFile
or whatever that function is called
data, file --> samefile with data added
Paolo Di Tommaso
@pditommaso
Jun 20 2017 12:00
Just a single file or many files?
Simone Baffelli
@baffelli
Jun 20 2017 12:01
every time I get new data, it should be added to the file
think of it as a sort of db
but to add stuff to it i first need to pass it to the process
Paolo Di Tommaso
@pditommaso
Jun 20 2017 12:02
Do you have a tool to add the data, or will you need to write a custom function?
Simone Baffelli
@baffelli
Jun 20 2017 12:02
I have a tool
that takes file and data
and adds stuff to file in place
Paolo Di Tommaso
@pditommaso
Jun 20 2017 12:04
a process should work, no?
Simone Baffelli
@baffelli
Jun 20 2017 12:04
yes, but how to pass the initial file to the process?
maybe creating it in a separate process?
Paolo Di Tommaso
@pditommaso
Jun 20 2017 12:05
Use a channel emitting the initial file, and concat it with the other one
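The feedback idea in this reply can be illustrated outside Nextflow. The Python below is only a rough sketch of the logic, not Nextflow code: a queue stands in for the channel, it is seeded with the initial file, and each step's output is fed back in as the next step's input. The names `add_to_file` and `run_feedback`, and the use of bytes to represent the file, are assumptions made for the example.

```python
from collections import deque


def add_to_file(db: bytes, datum: bytes) -> bytes:
    """Hypothetical stand-in for the tool that adds data to the file."""
    return db + datum


def run_feedback(initial: bytes, data) -> bytes:
    """Seed a queue with the initial file, then feed each output back in."""
    files = deque([initial])           # the "channel", seeded with the initial file
    for datum in data:
        current = files.popleft()      # latest version of the file
        files.append(add_to_file(current, datum))  # output becomes the next input
    return files.popleft()


final = run_feedback(b"HDR", [b"a", b"b", b"c"])
```

Each datum is applied to the most recent version of the file, so the steps run one after another rather than in parallel.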
Simone Baffelli
@baffelli
Jun 20 2017 12:05
right!
I can't think straight today :dizzy_face:
Paolo Di Tommaso
@pditommaso
Jun 20 2017 12:06
good luck :)
Simone Baffelli
@baffelli
Jun 20 2017 12:09
thanks :smile:
I probably need more rest
Paolo Di Tommaso
@pditommaso
Jun 20 2017 13:01
holidays are approaching :)
LukeGoodsell
@LukeGoodsell
Jun 20 2017 13:08
Hi again. Is there a good way in Nextflow to run each item in a list of lists of items through a process and then recombine the output to the original nested structure? The best way I can see involves adding a key to each input list, flattening it, and then using groupTuple to recombine, like so:
https://gist.github.com/LukeGoodsell/0352b1e626ddea721474f673bb2173e5
However, this loses the ordering of the items within the original lists, and involves more boilerplate than is ideal.
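The key, flatten, and regroup approach described in this message can be sketched in plain Python. This is an illustration of the logic only, not the contents of the gist: `tag_and_flatten`, `regroup`, and the multiply-by-ten stand-in for the process are assumptions, and reversing the list merely mimics results finishing in an arbitrary order.

```python
from collections import defaultdict


def tag_and_flatten(nested):
    """Pair every item with the 1-based index of the list it came from."""
    return [(group_idx, item)
            for group_idx, sub in enumerate(nested, start=1)
            for item in sub]


def regroup(pairs):
    """Collect results by group key, in whatever order they arrive."""
    groups = defaultdict(list)
    for group_idx, result in pairs:
        groups[group_idx].append(result)
    return dict(groups)


nested = [[1, 2, 3, 4], [5, 6, 7], [8]]
tagged = tag_and_flatten(nested)
# Stand-in for the process: multiply by ten, with completion order reversed
finished = [(g, item * 10) for g, item in reversed(tagged)]
regrouped = regroup(finished)
```

Group membership survives the round trip, but the items inside each group come back in completion order, which is the ordering loss mentioned above.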
Paolo Di Tommaso
@pditommaso
Jun 20 2017 13:15
Process execution is inherently parallel, so there's no way to guarantee the order unless you sort the resulting channel content
I'm not understanding this syntax
flatMap { it -> L:{ groupIdx++; it.collect { [ groupIdx, it ] } } }
LukeGoodsell
@LukeGoodsell
Jun 20 2017 13:18

It turns [ [1, 2, 3, 4], [5, 6, 7], [8] ] into

[1, 1]
[1, 2]
[1, 3]
[1, 4]
[2, 5]
[2, 6]
[2, 7]
[3, 8]

I.e.: each item is put as the second element of an array, where the first element is the group index from which it was taken

Paolo Di Tommaso
@pditommaso
Jun 20 2017 13:19
I've never seen the syntax L:{ ... }
LukeGoodsell
@LukeGoodsell
Jun 20 2017 13:19
I’m not especially worried about the ordering, but I thought there might be a better way
It forces interpretation as a code block rather than as a closure
Paolo Di Tommaso
@pditommaso
Jun 20 2017 13:20
oh
LukeGoodsell
@LukeGoodsell
Jun 20 2017 13:21
Without it, I got:
ERROR ~ Ambiguous expression could be either a parameterless closure expression or an isolated open code block;
   solution: Add an explicit closure parameter list, e.g. {it -> ...}, or force it to be treated as an open block by giving it a label, e.g. L:{...} @ line 10, column 20.
           .flatMap { it -> { groupIdx++; [ [ groupIdx, it ] ] } }
                      ^
Paolo Di Tommaso
@pditommaso
Jun 20 2017 13:22
I see, L: is a label definition
but I still don't see the advantage of
-> L:{ groupIdx++; it.collect { [ groupIdx, it ] } }
vs
-> groupIdx++; it.collect { [ groupIdx, it ] }
LukeGoodsell
@LukeGoodsell
Jun 20 2017 13:24
Hmm, I thought I tried that and it failed. Just tried it again and it didn’t.
Incidentally, I’ve prepared a version that preserves ordering within the nested lists (though not the outer list):
https://gist.github.com/LukeGoodsell/0352b1e626ddea721474f673bb2173e5#file-listoflistsofitemsorderpreserving-nf
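The gist itself is not reproduced here. One possible way to keep the inner ordering, offered as an assumption rather than as the gist's actual method, is to carry each item's position alongside its group key and sort by that position after regrouping. The Python below sketches that logic; `tag_with_position` and `regroup_in_order` are hypothetical names, and the reversed list again mimics arbitrary completion order.

```python
from collections import defaultdict


def tag_with_position(nested):
    """Attach (group index, position within the group) to every item."""
    return [(group_idx, pos, item)
            for group_idx, sub in enumerate(nested, start=1)
            for pos, item in enumerate(sub)]


def regroup_in_order(triples):
    """Group by key, then restore the original order inside each group."""
    groups = defaultdict(list)
    for group_idx, pos, result in triples:
        groups[group_idx].append((pos, result))
    return {group_idx: [result for _, result in sorted(members, key=lambda m: m[0])]
            for group_idx, members in groups.items()}


nested = [[1, 2, 3, 4], [5, 6, 7], [8]]
# Stand-in for the process: multiply by ten, with completion order reversed
finished = [(g, pos, item * 10)
            for g, pos, item in reversed(tag_with_position(nested))]
restored = regroup_in_order(finished)
```

The groups themselves still appear in arrival order, which matches the remark that the outer list's ordering is not preserved.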