These are chat archives for nextflow-io/nextflow

18th
Jun 2018
Maxime Garcia
@MaxUlysse
Jun 18 2018 14:20
Hej everyone, quick and stupid question: I'm trying to get rid of the .vcf.gz extension of a file, but .baseName only removes the .gz. Do you know anything other than it.baseName - ".vcf", or some other substring removal, to remove everything?
Paolo Di Tommaso
@pditommaso
Jun 18 2018 14:25
use file.simpleName instead
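For illustration, a minimal sketch of how the two attributes differ; the file name is a made-up placeholder and the file does not need to exist:

```groovy
// Hypothetical file name, used only for illustration.
def vcf = file('sample.vcf.gz')

println vcf.name        // sample.vcf.gz (full file name)
println vcf.baseName    // sample.vcf    (only the last extension removed)
println vcf.simpleName  // sample        (everything after the first dot removed)
```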
Maxime Garcia
@MaxUlysse
Jun 18 2018 14:28
Ooooh
Good one
Paolo Di Tommaso
@pditommaso
Jun 18 2018 14:31
:+1:
Powered by NF.
Paolo Di Tommaso
@pditommaso
Jun 18 2018 15:53
:clap: :clap: :clap: :clap:
by the time Twitter adds an edit button, it will always be too late ..
Jemma Nelson
@fwip
Jun 18 2018 16:40
What's the best practice for an optional stage in a pipeline? I.e., the pipeline can be A -> B -> C, or A -> C (in this case, B trims the fastq to a certain length). The first approach I can see is a when: block on B, with a corresponding if statement that simply forwards the output of A to C's input. The second approach I see is an if block inside B's script that just creates a symlink from input to output if it doesn't need to be run. Examples: https://gist.github.com/fwip/5cb28c350e8f3d9e2e2a1f7fc5762eda
Jemma Nelson
@fwip
Jun 18 2018 16:49
So in the second process (when trimming doesn't need to be done), I should just do the ln -s $input $output bit?
Paolo Di Tommaso
@pditommaso
Jun 18 2018 16:51
umm, no
let me think a bit
Jemma Nelson
@fwip
Jun 18 2018 16:52
ohh, I misunderstood the mix in the omega process
Paolo Di Tommaso
@pditommaso
Jun 18 2018 16:54
well, the whole point is the input for the downstream process
you can do
process trim_to_length {
  input:
  file(r1) from fastq

  output:
  file('r1.trim.fastq.gz') into trimmed_fastq

  when:
  params.trim_to != 0

  script:
  """
  zcat $r1 | awk 'NR%2==0 {print substr(\$0, 1, $params.trim_to)} NR%2!=0' | gzip -c -1 > r1.trim.fastq.gz
  """
}

ch1 = params.trim_to ?  trimmed_fastq : fastq
tho not so sure it works
Jemma Nelson
@fwip
Jun 18 2018 16:58
That seems like a pretty good approach, thank you :)
micans
@micans
Jun 18 2018 17:03

That's interesting. I've tried an approach with processes inside if statements:

if (params.runid != null && params.lane != null) {
  sample = params.runid + '-' + params.lane
  process from_runid {
      output:
          set file('*.cram') optional true into cram_files

      script:
      """
      echo ${params.runid} > 1.cram
      echo ${params.lane} > 2.cram
      echo ${params.runid}-${params.lane} > 3.cram
      ln -s 2.cram 4.cram
      """
  }
}

(this is my mini-testing environment -- other branching omitted), to achieve the same. This approach works as well, but I did wonder if there is a canonical way of doing this and whether my approach has irredeemable flaws.

Paolo Di Tommaso
@pditommaso
Jun 18 2018 17:04
no, my example won't work
Paolo Di Tommaso
@pditommaso
Jun 18 2018 17:09
Now I like it!
params.trim_to = 50  // example read length (an assumed value); 0 disables trimming

Channel.fromPath('.data/reads/*_1.fq.gz').set{ fastq_ch }

trim_ch = params.trim_to ? fastq_ch : Channel.empty()
no_trimmed_ch = !params.trim_to ? fastq_ch : Channel.empty()

process trim_to_length {
  input:
  file(r1) from trim_ch

  output:
  file('r1.trim.fastq.gz') into trimmed_fastq

  script:
  """
  zcat $r1 | awk 'NR%2==0 {print substr(\$0, 1, $params.trim_to)} NR%2!=0' | gzip -c -1 > r1.trim.fastq.gz
  """
}


process next_stage {
  input: 
  file x from trimmed_fastq.mix(no_trimmed_ch)
  """
  echo your_command --input $x
  """
}
micans
@micans
Jun 18 2018 17:12
Nice, an empty channel won't trigger the process
Paolo Di Tommaso
@pditommaso
Jun 18 2018 17:12
exactly
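The routing idea can also be tried without any processes. A minimal sketch, assuming made-up read names and a hypothetical default of params.trim_to = 0, with a map step standing in for the trimming process:

```groovy
// Hypothetical default: 0 (falsy in Groovy) means "skip trimming".
params.trim_to = 0

// Placeholder values standing in for real FASTQ files.
def reads_ch = Channel.from('s1.fq.gz', 's2.fq.gz')

// Only one of the two channels receives the reads; the other stays empty.
def to_trim_ch = params.trim_to ? reads_ch : Channel.empty()
def skip_ch    = params.trim_to ? Channel.empty() : reads_ch

// The map step stands in for the trimming process: it never fires
// when to_trim_ch is empty, so the reads reach mix untouched.
to_trim_ch
    .map { name -> "trimmed_${name}" }
    .mix(skip_ch)
    .view()
```

Running the same script with --trim_to 50 should send both names through the map step instead.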
micans
@micans
Jun 18 2018 17:12
:+1: :beers:
Off home, I'll be lurking here tomorrow again. Have a great evening
Félix C. Morency
@fmorency
Jun 18 2018 17:13
Cya!
Paolo Di Tommaso
@pditommaso
Jun 18 2018 17:14
:wave: :wave:
Mike Smoot
@mes5k
Jun 18 2018 17:18

I reworked the conditional-process pattern similarly:

params.flag = false

if (params.flag) {
    Channel.empty().set{foo_inch}
    Channel.from(1,2,3).set{bar_inch}
} else {
    Channel.from(4,5,6).set{foo_inch}
    Channel.empty().set{bar_inch}
}

process foo {

  input:
  val(f) from foo_inch

  output:
  file 'x.txt' into foo_ch

  script:
  """
  echo $f > x.txt
  """
}

process bar {
  input:
  val(b) from bar_inch

  output:
  file 'x.txt' into bar_ch

  script:
  """
  echo $b > x.txt
  """
}

process omega {
  echo true
  input:
  file x from foo_ch.mix(bar_ch)

  script:
  """
  cat $x
  """
}

If you want I can make this a pull request.

Paolo Di Tommaso
@pditommaso
Jun 18 2018 17:20
Nice, I propose adding it as an alternative solution
Mike Smoot
@mes5k
Jun 18 2018 17:22
Yes, just what I was thinking. I'll see if I can put together a pull request to the patterns repo for you to review.
Paolo Di Tommaso
@pditommaso
Jun 18 2018 17:23
:+1:
Jemma Nelson
@fwip
Jun 18 2018 17:29
Thanks y'all :D