These are chat archives for nextflow-io/nextflow

21st
Feb 2019
Daniel E Cook
@danielecook
Feb 21 10:40

Curious how people handle large parameter spaces when running pipelines. One way is to pass the different values to processes:

// x = [1, 2]; y = ['a', 'b']; z = ['alpha', 'beta', 'zeta']
input:
    set val(x), val(y), val(z) from many_combinations  // channel emitting each (x, y, z) combination

But passing a large number of variables around can become cumbersome if you want to look at 5 or 6 different variables. So what about using a LinkedHashMap to store the variables, pass that around, and then use a function that converts the hashmap to a slug (e.g. x-1,y-a,z-alpha) which can be used when outputting files.

Has anyone experimented with this sort of approach and are there any potential issues with doing so?
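
A minimal sketch of the idea in plain Groovy, outside any Nextflow process. The helper name `slug` and the variable `combos` are hypothetical, chosen here for illustration; the separator format follows the x-1,y-a,z-alpha example in the question.

```groovy
// Hypothetical helper: turn a parameter map into a slug such as "x-1,y-a,z-alpha".
// Groovy map literals are LinkedHashMaps, so keys keep their insertion order.
def slug(Map paramSet) {
    paramSet.collect { k, v -> "${k}-${v}" }.join(',')
}

// Build one map per combination of the example values (2 * 2 * 3 = 12 maps).
def combos = [[1, 2], ['a', 'b'], ['alpha', 'beta', 'zeta']]
    .combinations()
    .collect { x, y, z -> [x: x, y: y, z: z] }

assert combos.size() == 12
assert slug([x: 1, y: 'a', z: 'alpha']) == 'x-1,y-a,z-alpha'
```

Each map in `combos` could then be passed through a channel as a single value, with `slug` used to name output files.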

make sure to use the map as an immutable collection
Daniel E Cook
@danielecook
Feb 21 10:46
Thanks! I was having issues with caching...
ah ok
Daniel E Cook
@danielecook
Feb 21 10:52
So you can't modify the value of the 'immutable' collection inside a process? You have to define it in its entirety before passing it to processes, or you will run into cache issues. Is that correct?
For example, under the script directive:
        row.param_set['use_m'] = use_m == "-M" ? 'T' : 'F'
        row.param_set['clean_sam'] = use_clean_sam ? 'T' : 'F'
This is a LinkedHashMap btw
Paolo Di Tommaso
@pditommaso
Feb 21 10:53
make a copy and then modify the new one
row = new HashMap(row)
row.param_set['use_m'] = use_m == "-M" ? 'T' : 'F'
row.param_set['clean_sam'] = use_clean_sam ? 'T' : 'F'
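
One caveat worth sketching in plain Groovy: `new HashMap(row)` makes a shallow copy, so a nested map such as `param_set` is still shared with the original unless it is copied too. The `original` map and its contents below are hypothetical stand-ins for the `row` in the discussion.

```groovy
// Hypothetical row: an outer map holding a nested param_set map.
def original = [sample: 'S1', param_set: [x: 1]]

// Shallow copy: the outer map is new, but param_set is the very same object.
def shallow = new HashMap(original)
assert shallow.param_set.is(original.param_set)

// Copy the nested map as well before modifying it.
def copy = new HashMap(original)
copy.param_set = new LinkedHashMap(original.param_set)
copy.param_set['use_m'] = 'T'

// The original is left untouched; only the copy carries the new key.
assert original.param_set == [x: 1]
assert copy.param_set == [x: 1, use_m: 'T']
```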
Daniel E Cook
@danielecook
Feb 21 10:56
Thanks! One last question - is there a way to print the 'hash' used in caching for a file or val?
Paolo Di Tommaso
@pditommaso
Feb 21 10:57
?
Daniel E Cook
@danielecook
Feb 21 10:59
the md5sum or hex digest of an object
Paolo Di Tommaso
@pditommaso
Feb 21 11:01
use the standard java object.hashCode()
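
A small sketch of what `hashCode()` reports for a map in plain Groovy. It only illustrates the JVM method named above; how Nextflow derives its cache keys is not shown in this thread. The variable names are hypothetical.

```groovy
// Two maps with equal contents report the same hashCode().
def a = [x: 1, y: 'a']
def b = new LinkedHashMap(a)
assert a.hashCode() == b.hashCode()

// Mutating a map changes its hashCode(), which is one reason to treat
// maps passed between processes as immutable.
def before = a.hashCode()
a.z = 'alpha'
assert a.hashCode() != before

// Print the value as a hex string for inspection.
println Integer.toHexString(a.hashCode())
```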
Martin Proks
@matq007
Feb 21 14:34
Has anyone had a problem with the -c option when running nextflow? When I run a command like this:
nextflow run NGI-RNAfusion -profile munin -c 'conf/munin-singularity.config' --reads 'data/reads_{1,2}.fq.gz' --genome GRCh38
the munin-singularity config doesn't see the params scope from nextflow.config. Any idea whether this is normal behavior or a bug?
The munin-singularity.config uses a variable, params.container_version, which is defined in nextflow.config.
Ernesto Lowy
@elowy01
Feb 21 17:21
Hi Paolo, Thanks for this nice piece of software!
Paolo Di Tommaso
@pditommaso
Feb 21 17:21
heyyy Ernesto !!
nice to see you here
Ernesto Lowy
@elowy01
Feb 21 17:22
Hi, I hope everything is fine at the CRG!
Paolo Di Tommaso
@pditommaso
Feb 21 17:23
it is
having fun with NF at ebi ?
Ernesto Lowy
@elowy01
Feb 21 17:23
yes, we are using it in some pipelines
Paolo Di Tommaso
@pditommaso
Feb 21 17:24
nice to hear that
Ernesto Lowy
@elowy01
Feb 21 17:24
I was wondering how this snippet:
Channel
    .fromPath(params.file)
    .splitCsv(header: true)
    .map { row -> tuple(row.url, file(row.dest), row.prefix) }
    .set { paths_ch }
can be used by two different processes?
Paolo Di Tommaso
@pditommaso
Feb 21 17:25
replace .set { paths_ch } with .into { paths_ch1; paths_ch2 }
and use paths_ch1 and paths_ch2
however soon it will not be needed any more
Ernesto Lowy
@elowy01
Feb 21 17:25
ok, I will try
thanks!
Paolo Di Tommaso
@pditommaso
Feb 21 17:26
:+1:
Toni Hermoso Pulido
@toniher
Feb 21 21:15
Hi, sorry if this is a bit off-topic. I'm trying Nextflow on AWS, and when using AWS Batch my jobs fail with this error: 'CannotStartContainerError: Error response from daemon: OCI ...'
Anyone experienced this?