These are chat archives for nextflow-io/nextflow

11th
Aug 2017
Simone Baffelli
@baffelli
Aug 11 2017 08:39
Morning! Is there any accepted best practice to cross or phase three channels instead of two?
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:40
chaining them ?
Simone Baffelli
@baffelli
Aug 11 2017 08:41
but then I cannot specify the same key
because the output of the first cross consists of lists, while the third channel does not
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:43
maybe combine is a better alternative to cross ?
Simone Baffelli
@baffelli
Aug 11 2017 08:44
It could be..or maybe I'm just using the wrong logic
I have tuples consisting of [data, timestamp, method] and I would like to collect them into lists of [[data1,...datan],[timestamp1,...timestampn],method] by method
Because I want to apply the same processing to each subsequence for each method
I've been struggling with it for days now
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:46
so you want only the first component ?
Simone Baffelli
@baffelli
Aug 11 2017 08:48
I need to collect the same number of elements with the same order for each method
and to collect I must use buffer to ensure that the difference between timestamp1 and timestampn does not exceed a certain duration
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:50
I think you should be able to do that with
  1. mix all the channels together
  2. map to get only the first component
  3. collate to slice them (or buffer ..)
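The three steps above can be sketched as follows. This is only a minimal illustration: the channel names `a`, `b`, `c` and their values are invented, and it assumes a Nextflow version where `Channel.from` is available. Note that `mix` does not guarantee the relative order of items coming from different channels, so the content of each slice may vary between runs.

```groovy
// Hypothetical inputs: each item is [data, timestamp, method]
a = Channel.from(['d1', 1, 'A'], ['d2', 2, 'A'])
b = Channel.from(['d3', 1, 'B'], ['d4', 2, 'B'])
c = Channel.from(['d5', 1, 'C'], ['d6', 2, 'C'])

a.mix(b, c)           // 1. merge all the channels into one
 .map { it[0] }       // 2. keep only the first component (the data)
 .collate(2)          // 3. emit slices of two items each
 .view()
```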
Simone Baffelli
@baffelli
Aug 11 2017 08:50
I tried something similar
but I want to keep the method and get the same number of slices for each method
with the same timestamps, to ensure that I have representative statistics
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:52
umm, if so maybe you need to groupTuple by the method and then .. I don't know :)
prototype your problem by using nextflow console
Simone Baffelli
@baffelli
Aug 11 2017 08:53
Yes, probably groupTuple is the way to go
otherwise I will apply the buffering to each channel separately
and then mix them
although I don't find the solution very elegant
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:53
feel free to propose a new op
Simone Baffelli
@baffelli
Aug 11 2017 08:54
Well then I think an n-way mixing op will be quite useful
sorry phasing
damn timeseries, they make everything harder because they require I pay attention to ordering every time I want to collect stuff :scream:
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:55
mix already supports many channels eg source.mix(a,b,c,..n)
ah
I'm rewriting phase to make it simpler to use
Simone Baffelli
@baffelli
Aug 11 2017 08:56
Yes, mix does that but phase does not
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:57
the new one will be easier to chain as you are suggesting
Simone Baffelli
@baffelli
Aug 11 2017 08:57
:plus1:
very good news
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:57
not sure how to call it, join, pair, pairTuple, .. bo?
Simone Baffelli
@baffelli
Aug 11 2017 08:59
synchronize?
Paolo Di Tommaso
@pditommaso
Aug 11 2017 08:59
java keyword ! :)
Simone Baffelli
@baffelli
Aug 11 2017 08:59
?
Paolo Di Tommaso
@pditommaso
Aug 11 2017 09:00
Simone Baffelli
@baffelli
Aug 11 2017 09:02
Oh
sync
Paolo Di Tommaso
@pditommaso
Aug 11 2017 09:04
also sync reminds me too much of synchronisation and critical sections in concurrent programming, which I would like to avoid
Simone Baffelli
@baffelli
Aug 11 2017 09:06
Right
Simone Baffelli
@baffelli
Aug 11 2017 09:15
I also like phase because it immediately emits phased tuples
well I think what I wanted to do can be achieved with groupTuple quite easily
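A minimal sketch of the groupTuple approach, assuming tuples of the form [data, timestamp, method] as described earlier in the conversation. The values are invented for illustration, and the time-window buffering discussed above is not covered here.

```groovy
// Hypothetical items: [data, timestamp, method]
Channel
    .from(
        ['d1', 1, 'A'],
        ['d2', 1, 'B'],
        ['d3', 2, 'A'],
        ['d4', 2, 'B']
    )
    .groupTuple(by: 2)   // group on the third element (index 2), the method
    .view()
// Each emitted item has the shape [[data...], [timestamp...], method]
```

Unless a group size is given, groupTuple can only emit once the source channel has completed, since it cannot otherwise know that a group is finished.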
Paolo Di Tommaso
@pditommaso
Aug 11 2017 09:19
:ok_hand:
Simone Baffelli
@baffelli
Aug 11 2017 09:19
:point_down:
you know the game right? :grimacing:
Paolo Di Tommaso
@pditommaso
Aug 11 2017 09:20
what's up ! :)
Simone Baffelli
@baffelli
Aug 11 2017 09:20
Don't you know the stupid game with this gesture :ok_hand: ?
Paolo Di Tommaso
@pditommaso
Aug 11 2017 09:20
ahahah
Francesco Strozzi
@fstrozzi
Aug 11 2017 09:25
oh god, reminds me of high school this.
:smile:
Simone Baffelli
@baffelli
Aug 11 2017 09:27
here's where you recognize the Italian users :grinning:
Paolo Di Tommaso
@pditommaso
Aug 11 2017 09:28
OMG, gestures also in the chat!
Francesco Strozzi
@fstrozzi
Aug 11 2017 09:30
:+1:
Simone Baffelli
@baffelli
Aug 11 2017 09:31
:clap:
Shellfishgene
@Shellfishgene
Aug 11 2017 14:41
I have multiple samples, and two sets of paired files for each from different channels. How do I make sure a process that gets input from both channels combines the right file pairs, so both for the same sample?
For example:
input:
    set sample_id, file(files) from fastq_files
    set sample_id2, file(files2) from other_fastq_files
How do I make sure sample_id is the same as sample_id2?
Paolo Di Tommaso
@pditommaso
Aug 11 2017 14:55
the order of the channels is guaranteed, hence they will be received in the same order as they are sent
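Where the emission order cannot be relied on, for example when the two channels are produced by processes whose tasks run in parallel, the items can instead be matched explicitly on the sample id. The sketch below assumes a Nextflow release that provides the `join` operator (at the time of this chat the older `phase` operator served a similar purpose); the sample ids and file names are invented.

```groovy
// Hypothetical channels of [sample_id, files] tuples, deliberately in a different order
fastq_files       = Channel.from(['s1', 'a_1.fq'], ['s2', 'b_1.fq'])
other_fastq_files = Channel.from(['s2', 'b_2.fq'], ['s1', 'a_2.fq'])

// join matches items on the first element and emits [sample_id, files, files2]
paired = fastq_files.join(other_fastq_files)

paired.view()
```

A process could then take a single input from the joined channel, along the lines of `set sample_id, file(files), file(files2) from paired`, assuming the channels carry real file objects rather than the placeholder strings used here.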
Francesco Strozzi
@fstrozzi
Aug 11 2017 15:07
the S3 file system used in NF is this one ? https://github.com/Upplication/Amazon-S3-FileSystem-NIO2
Paolo Di Tommaso
@pditommaso
Aug 11 2017 15:09
NF uses a fork of that project https://github.com/nextflow-io/nextflow-s3fs
Francesco Strozzi
@fstrozzi
Aug 11 2017 15:10
thx
Shellfishgene
@Shellfishgene
Aug 11 2017 15:14
@pditommaso ok, thanks
Mike Smoot
@mes5k
Aug 11 2017 16:23
Hi @pditommaso, would it be possible to add the queue process directive as an option for the trace file output? I'd like to know which queue a process ran on to help track costs.
Paolo Di Tommaso
@pditommaso
Aug 11 2017 16:30
it makes sense, please open an issue on GH
Mike Smoot
@mes5k
Aug 11 2017 16:33
Great, will do
If you can point me in the right direction I might be able to submit a patch to go with that issue!
Paolo Di Tommaso
@pditommaso
Aug 11 2017 17:02
Nice, I will comment on GH