These are chat archives for nextflow-io/nextflow

9th Jan 2019
Stephen Kelly
@stevekm
Jan 09 02:45
@pditommaso the Cartesian product generated from combine does not give me the combinations that have fewer than n args. For example, using that set of params, I would not get the combinations "param1", "param1, param2", "param2, param3", etc.
Stephen Kelly
@stevekm
Jan 09 02:56
It also does not give me e.g. "param1, param2, param3"
Stephen Kelly
@stevekm
Jan 09 03:18
ok, I think I got it:
Channel.from([
[['param1', '-arg 1']],
[['param2', '-arg 2']],
[['param3', '-arg 3']],
[['param4', '-arg 4']],
]).into { myparams_ch0; myparams_ch1; myparams_ch2; myparams_ch3; myparams_ch4 }
Channel.from('foo').into { input_ch1; input_ch2; input_ch3 }

// create cartesian product from 4 copies of the input channel
myparams_ch1.combine(myparams_ch2)
    .combine(myparams_ch3)
    .combine(myparams_ch4)
    .map { set1, set2, set3, set4 ->
        // remove duplicate params from each set
        def unique = [ set1, set2, set3, set4 ] as Set
        return(unique)
    }
    .unique() // remove duplicate outputs
    .set { combined_params }

process run3 {
    tag "${params}"
    echo true
    input:
    set val(x), val(params) from input_ch3.combine(combined_params)

    script:
    val1 = params.collect { it[0] }.join('.')
    val2 = params.collect { it[1] }.join(' ')
    """
    echo "${val1}: ${val2}"
    """
}
Rad Suchecki
@rsuchecki
Jan 09 05:04
How about something like this @stevekm
def args = ['-arg 1', '-arg 2', '-arg 3', '-arg 4']

comb = []
// for each combination size from 1 up to the number of args...
1.upto(args.size()) {
    // ...take the cartesian product of that many copies of args
    [args].multiply(it).eachCombination { list ->
      // keep only strictly increasing sequences, i.e. each unordered combination once
      if(list.size() == 1 || (1..<list.size()).every { list[it - 1] < list[it] }) {
           comb << list
      }
    }
}
Channel.from(comb).subscribe { println "$it" }
Stephen Kelly
@stevekm
Jan 09 05:32
oh @rsuchecki that looks like the exact result I want, but is there a way to make it work with [["param1", '-arg 1'], ["param2", '-arg 2'], ["param3", '-arg 3'], ["param4", '-arg 4']] or ["param1":'-arg 1', "param2":'-arg 2', "param3":'-arg 3', "param4":'-arg 4']? I need to keep track of labels for each arg. I am not that good with Groovy though
Rad Suchecki
@rsuchecki
Jan 09 05:43
I think the only thing to change is the comparison,
list[it - 1] < list[it] to e.g. list[it - 1][0] < list[it][0]
this comparison is just a way to keep param sets unique,
the above is for input like: def args = [['param1', '-arg 1'], ['param2', '-arg 2'], ['param3', '-arg 3']]
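putting it together for the labeled args, something like this should work (an untested sketch, with the comparison done on the label at element [0]):

def args = [['param1', '-arg 1'], ['param2', '-arg 2'], ['param3', '-arg 3'], ['param4', '-arg 4']]

comb = []
1.upto(args.size()) { n ->
    // cartesian product of n copies of the labeled args
    [args].multiply(n).eachCombination { list ->
        // compare the labels (element [0]) so each unordered combination is kept once
        if (list.size() == 1 || (1..<list.size()).every { list[it - 1][0] < list[it][0] }) {
            comb << list
        }
    }
}
Channel.from(comb).subscribe { println "$it" }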
Stephen Kelly
@stevekm
Jan 09 06:01
thanks that works!
Rad Suchecki
@rsuchecki
Jan 09 06:02
:+1:
Anand Mayakonda
@PoisonAlien
Jan 09 10:03
Hi, just started with Nextflow and Groovy. Great stuff. Could anyone recommend how I can write the output from fromFilePairs to an output CSV?
This is what I have done.
reads_ch = Channel
  .fromFilePairs(params.reads)
  .splitCsv()
Daniel E Cook
@danielecook
Jan 09 10:05
@PoisonAlien why are you trying to output reads to a csv? What format are the reads in?
Anand Mayakonda
@PoisonAlien
Jan 09 10:06
One of the aligners (gemBS) requires input files to be listed in a CSV file. I am trying to generate this file from within the Nextflow pipeline.
For example, this is how the input file for gemBS looks. I am trying to generate this in Groovy based on the output from fromFilePairs
File1,File2,Barcode,file_id
fastq/AS-277115-LR-38819_R1.fastq.gz,/fastq/AS-277115-LR-38819_R2.fastq.gz,AS-277115-LR-38819,AS-277115-LR-38819
Oops, sorry, I am not trying to write the reads themselves to CSV, I just want the file names to be in a CSV file - which in turn will be used as input for the aligner.
Anand Mayakonda
@PoisonAlien
Jan 09 11:13

Okay, I got it.

reads_ch = Channel.fromFilePairs(params.reads)

process make_csv{

  publishDir file("01_gemBS_input")

  input:
    set sample_name, fastq from reads_ch

  output:
    set file("${sample_name}.csv") into gemBS_inputs

  script:
    fastq = fastq.join(',')

  """
  echo "File1,File2,Barcode,file_id" > ${sample_name}.csv
  echo "${fastq},${sample_name}" >>  ${sample_name}.csv
  """

}

I just made a channel for creating one.

Anthony Underwood
@aunderwo
Jan 09 15:01

Hi @pditommaso. I am in the process of converting a pipeline to the beta module feature. So far I'm sold!!
I am getting an error with a process that has 2 inputs

  input:
  set sample_id, min_read_length, file(reads)
  file('adapter_file.fas')

I've tried to call this as

trimmed_reads = min_read_length_and_raw_fastqs.trim_reads(adapter_file)

and

trimmed_reads = trim_reads(min_read_length_and_raw_fastqs, adapter_file)

Both result in the error

Caused by: java.lang.IllegalStateException: Missing 'bind' declaration in input parameter

micans
@micans
Jan 09 15:03
it seems to have three inputs?
sample_id
Anthony Underwood
@aunderwo
Jan 09 15:04
I have 2 channels as inputs
  1. This is made by joining 2 channels
    min_read_length_and_raw_fastqs = min_read_length.join(raw_fastqs_for_trimming)
  2. The 2nd channel is a single-item channel containing a file
Anthony Underwood
@aunderwo
Jan 09 15:10
scratch that - schoolboy error!! Hadn't changed process to processDef
Paolo Di Tommaso
@pditommaso
Jan 09 15:43
ok, so far it is still very experimental; I would like to keep process without introducing a new keyword
feel free to comment on the issue nextflow-io/nextflow#984
k-hench
@k-hench
Jan 09 15:49
Hi everyone,
this might be a silly question - but is there a way to specify several publishDir locations for different output channels within a single process?
Paolo Di Tommaso
@pditommaso
Jan 09 15:49
yes, use several publishDirs
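e.g. something along these lines (process, channel and directory names are just made up for illustration), each publishDir with its own pattern:

process make_outputs {
    // each publishDir has its own pattern, so different outputs go to different places
    publishDir 'results/tables', pattern: '*.csv'
    publishDir 'results/plots', pattern: '*.png'

    output:
    file '*.csv' into tables_ch
    file '*.png' into plots_ch

    script:
    """
    touch summary.csv plot.png
    """
}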
k-hench
@k-hench
Jan 09 15:56
thanks - that was embarrassingly simple....
Paolo Di Tommaso
@pditommaso
Jan 09 15:57
:wink:
Tobias Neumann
@t-neumann
Jan 09 16:39
@pditommaso Is there any update on the issue where an s3 copy process to an s3 publishDir from an s3 workDir takes forever? Much longer than the actual upload + computation?
Anthony Underwood
@aunderwo
Jan 09 18:48

@pditommaso I have just finished converting my bacterial mapping and phylogeny generation workflow to one that imports modules

Very easy after a few initial hiccups and the code is SO clean now.
The main nf file went from 300 lines to 60 and the overall process is so much easier to see at a glance now.

This was the one niggle I had with NF and now that's gone too. :thumbsup:

OK I could wish it was in Python but that would be just plain greedy!!

Stephen Kelly
@stevekm
Jan 09 23:21
so I am looking at deploying our pipelines on a cloud platform in addition to our HPC; is there any consensus on the Google vs. Amazon support in Nextflow? Google Cloud support just came out for Nextflow; is it as robust and mature as the AWS support?