These are chat archives for nextflow-io/nextflow

5th
Oct 2017
Simone Baffelli
@baffelli
Oct 05 2017 07:33
Good morning. A very detailed question today: suppose I have a process taking a set input consisting of a pair of lists $$([file_1,...,file_N],[date_1,...,date_N])$$. Does changing the order of elements inside these lists invalidate the process cache, or does the process hash the content of each list individually regardless of order?
Evan Floden
@evanfloden
Oct 05 2017 07:39
Fancy formatting! How?
Simone Baffelli
@baffelli
Oct 05 2017 07:40
Just use \$$
around your LaTeX code
Disregard the backslash
Evan Floden
@evanfloden
Oct 05 2017 07:41
Cool, thanks!
Simone Baffelli
@baffelli
Oct 05 2017 07:41
$$very_{cool}^{indeed}$$
Paolo Di Tommaso
@pditommaso
Oct 05 2017 08:21
it depends: for collections of files returned by .collect the order is not taken into account
for other collections, yes
Simone Baffelli
@baffelli
Oct 05 2017 08:56
I'm using .collect() followed by .map{it.transpose()}
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:02
you mean .collect().map { .. }
Simone Baffelli
@baffelli
Oct 05 2017 09:04
yes
and it does depend on the order
I just tested it
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:05
if so, the order matters
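Why element order can change a cache key is easy to illustrate in plain Groovy. The sketch below uses the standard `List.hashCode()` and `Set.hashCode()` purely as an analogy; it is not Nextflow's hashing code, and the file names are made up for illustration.

```groovy
// Hypothetical file names, same elements in a different order.
def a = ['unwGrid1.csv', 'unwGrid2.csv']
def b = ['unwGrid2.csv', 'unwGrid1.csv']

// An ordered list hashes differently when its elements are permuted...
assert a.hashCode() != b.hashCode()

// ...whereas an unordered set with the same elements hashes identically.
def setA = a as Set
def setB = b as Set
assert setA.hashCode() == setB.hashCode()

// Sorting both lists first gives the same list, hence the same hash.
def sortedA = a.sort(false)   // sort(false) returns a new list and leaves 'a' untouched
def sortedB = b.sort(false)
assert sortedA == sortedB
assert sortedA.hashCode() == sortedB.hashCode()
```

This is the reasoning behind sorting a collection before handing it to a process: a stable order gives a stable key.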
Simone Baffelli
@baffelli
Oct 05 2017 09:05
This is the process that needs these files:
process separableSpatioTemporalVariogram{

    input:
        set file(unwGridCollected:"unwGrid*.csv"), val(masterIds), val(slaveIds) from collectedUnwGridsForSeparableVG
        // file(collectedGrids) from collectedUnwGridsForSeparableVG
        // file(firstGrid) from firstFitGridForSpatioTemporalVariogram
        val(trendFormula) from params.best_trend
    output:
        set file(temporalPlot), file(spatialPlot) into variogramPlots
        set file(temporalModel), file(spatialModel), val(masterIds), val(slaveIds) into variogramModelFits
    shell:
        '''
        ##Extract header
        head -2 unwGrid1.csv > all.txt; tail -n +2 -q unwGrid*.csv >> all.txt
        #Compute variogram fits
        spatio_temporal_variogram.R all.txt "!{trendFormula}" temporalPlot temporalModel spatialPlot spatialModel
        '''
}
Perhaps it can be solved with cache 'deep'
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:13
I think it won't work
Simone Baffelli
@baffelli
Oct 05 2017 09:13
:cry:
nextflow needs a custom caching feature :grin:
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:13
you could sort the collection to keep it in the same order
Simone Baffelli
@baffelli
Oct 05 2017 09:14
That was what I was thinking
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:14
you can do
.collect().map{..}.sort { .. }
Simone Baffelli
@baffelli
Oct 05 2017 09:15
yeah, e.g. sort them by date
I still think a custom caching operator would be great
even if it is too complicated for many applications
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:15
new features and pull requests are (almost) always welcome :)
Simone Baffelli
@baffelli
Oct 05 2017 09:16
If only I had time!
I was also considering a version of buffer that exposes the current buffer in the closure
for my weird type of application that would be very useful
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:17
sometimes I would like to have a 36h long day
Simone Baffelli
@baffelli
Oct 05 2017 09:17
Don't tell me
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:17
or maybe I should spend less time on gitter
Simone Baffelli
@baffelli
Oct 05 2017 09:18
:laughing:
For sure not having a private life/hobbies helps. But I don't want to live such a life
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:20
:laughing:
Luca Cozzuto
@lucacozzuto
Oct 05 2017 09:21
or maybe CRG should start thinking about giving you some more developers :)
Simone Baffelli
@baffelli
Oct 05 2017 09:22
that would be a great idea
Luca Cozzuto
@lucacozzuto
Oct 05 2017 09:22
we should start a petition
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:23
that's a good point, core facilities could be a natural sponsor :)
Simone Baffelli
@baffelli
Oct 05 2017 09:25
Otherwise you could sell your soul and make nextflow into a commercial product :japanese_goblin:
Paolo Di Tommaso
@pditommaso
Oct 05 2017 09:27
sooooo complicated
Simone Baffelli
@baffelli
Oct 05 2017 09:27
ofc I'm not being serious.
Simone Baffelli
@baffelli
Oct 05 2017 10:17
sorting the collections does not seem to make a difference
Paolo Di Tommaso
@pditommaso
Oct 05 2017 10:17
not possible
are you sure you are sorting correctly ?
Simone Baffelli
@baffelli
Oct 05 2017 10:18
.map{it->it.transpose().sort{}}
Paolo Di Tommaso
@pditommaso
Oct 05 2017 10:18
ummmm
Simone Baffelli
@baffelli
Oct 05 2017 10:18
should suffice, according to a small test I did with groovyConsole
or perhaps I must give a sorting closure?
Paolo Di Tommaso
@pditommaso
Oct 05 2017 10:19
you may, depending on the structure of your list
Simone Baffelli
@baffelli
Oct 05 2017 10:20
it may be worth first sorting and then transposing
ensuring that everything is sorted by date
the fact that I'm dealing with time series makes it all a bit more complicated
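A minimal Groovy sketch of the "sort first, then transpose" idea, runnable in groovyConsole. The sample file names and dates are hypothetical, and the sketch assumes the data arrives as two parallel lists with ISO-formatted date strings (so that string order equals chronological order); the real channel contents may be shaped differently.

```groovy
// Hypothetical sample data: two parallel lists in arbitrary arrival order.
def files = ['unwGrid3.csv', 'unwGrid1.csv', 'unwGrid2.csv']
def dates = ['2017-03-01', '2017-01-01', '2017-02-01']

// Zip the lists into [file, date] pairs so each file stays attached to its date.
def pairs = [files, dates].transpose()

// Sort the pairs with an explicit key (the date); sort(false) returns a new list.
def sortedPairs = pairs.sort(false) { it[1] }

// Unzip back into two lists that are now in a reproducible order.
def (sortedFiles, sortedDates) = sortedPairs.transpose()

assert sortedFiles == ['unwGrid1.csv', 'unwGrid2.csv', 'unwGrid3.csv']
assert sortedDates == ['2017-01-01', '2017-02-01', '2017-03-01']
```

Inside a pipeline the same closure body would go in the `.map { .. }` step after `.collect()`. The key point is that the sort gets an explicit key closure rather than an empty one.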
Mike Smoot
@mes5k
Oct 05 2017 20:20

Hi @pditommaso, I'm trying to decipher some output from .nextflow.log. Can you help me interpret this:

[process] buildBreakPlotRDataImage
  status=ACTIVE
  port 0: (queue) OPEN; channel: -
  port 1: (value) -   ; channel: edata_file
  port 2: (cntrl) OPEN; channel: $

For port 0, channel: - simply means that the first channel generates a set so you don't print the variable name, correct? I'm also unsure what the - means in port 1: (value) -? It's not OPEN or CLOSED, so I'm not quite sure what it means.

Paolo Di Tommaso
@pditommaso
Oct 05 2017 20:21
:D
you requested this feature ..
it means that it's a value channel, hence it can be neither open nor closed
Mike Smoot
@mes5k
Oct 05 2017 20:22
I know and it's awesome!
I just don't quite understand what it's telling me.
Paolo Di Tommaso
@pditommaso
Oct 05 2017 20:22
in other words it cannot hang the pipeline execution
Mike Smoot
@mes5k
Oct 05 2017 20:23
Got it, that makes perfect sense.
Now back to figuring out why I have 10 active nodes.... sigh
Paolo Di Tommaso
@pditommaso
Oct 05 2017 20:24
well, once there's one, all downstream processes can remain active
Mike Smoot
@mes5k
Oct 05 2017 20:25
That's very helpful!
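To scan a longer dump for ports that are still open, something like the following Groovy sketch could help. The regular expression is an assumption derived only from the three port lines quoted above, not from any documented log format. The filter follows the explanation in the chat: value ports have no open/closed state, so only `queue` ports reported as `OPEN` are listed.

```groovy
// Log excerpt copied from the message above.
def dump = '''\
[process] buildBreakPlotRDataImage
  status=ACTIVE
  port 0: (queue) OPEN; channel: -
  port 1: (value) -   ; channel: edata_file
  port 2: (cntrl) OPEN; channel: $
'''

// Hypothetical pattern for lines shaped like "port N: (type) state; channel: name".
def portLine = /port (\d+): \((\w+)\) (\S+)\s*; channel: (\S+)/

// Turn every matching line into a small map; non-matching lines are dropped.
def ports = dump.readLines().collect { line ->
    def m = (line =~ portLine)
    m.find() ? [index: m.group(1).toInteger(), type: m.group(2),
                state: m.group(3), channel: m.group(4)] : null
}.findAll { it != null }

// Queue ports that are still open are the ones worth tracing upstream.
def openQueues = ports.findAll { it.type == 'queue' && it.state == 'OPEN' }

assert ports.size() == 3
assert openQueues*.index == [0]
```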