These are chat archives for nextflow-io/nextflow

29th
Apr 2016
Jason Byars
@jbyars
Apr 29 2016 20:51
Hi, if I have a script section that calculates a few values for me in R, what is the sanest way to get them into an output channel? It seems like my options are to print to stdout and parse it in the next process, or spit out a json file, grab the values out of it, and shove those in the channel. Is there an easier way?
here's the example of what I'd like to do:
process get_stats {
  tag { bf }
  time '4h'

  input:
  file bf from bamfiles

  output:
  set file(bf),val(mean_read_length), val(sd_read_length) into stats

  script:
  """
  #!/usr/bin/Rscript --vanilla
  library(Rsamtools)
  library(jsonlite)

  param <- ScanBamParam(what="qwidth")
  bam <- scanBam("${bf}", param=param)[[1]]
  mean_read_length <- mean(bam\$qwidth, na.rm=T)
  sd_read_length <- sd(bam\$qwidth, na.rm=T)
  """
}
what is the easiest way to get the mean_read_length and sd_read_length values into the output channel?
Paolo Di Tommaso
@pditommaso
Apr 29 2016 22:18
Hi, that would be nice but it's not possible because those are variables in the context of R
and thus cannot be captured by nextflow
you can print to the stdout and save those values to a file, I usually prefer csv to json
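A minimal sketch of the file-based route, using the same process syntax as the question. The file name `stats.csv` and the channel name `stats_files` are made up for illustration, and the R code is untested against real BAM files:

```groovy
process get_stats {
  tag { bf }

  input:
  file bf from bamfiles

  output:
  // emit the BAM file together with the CSV written by the R script
  set file(bf), file('stats.csv') into stats_files

  script:
  """
  #!/usr/bin/Rscript --vanilla
  library(Rsamtools)

  param <- ScanBamParam(what="qwidth")
  bam <- scanBam("${bf}", param=param)[[1]]

  # one header row and one data row; stats.csv is a hypothetical name
  stats <- data.frame(mean_read_length = mean(bam\$qwidth, na.rm=TRUE),
                      sd_read_length   = sd(bam\$qwidth, na.rm=TRUE))
  write.csv(stats, "stats.csv", row.names=FALSE, quote=FALSE)
  """
}
```

Nextflow only sees the file the script leaves in the task directory, so the R variables never have to cross the language boundary.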
Jason Byars
@jbyars
Apr 29 2016 22:20
ok, just wanted to make sure there wasn't some pass back mechanism I was overlooking.
I do as well, but half of my reality is going json at the moment...
Paolo Di Tommaso
@pditommaso
Apr 29 2016 22:21
as long as you have those values in a channel you can also manipulate them with one or more operators
that's usually easier
I do as well, but half of my reality is going json at the moment...
:)
json is the new xml
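To illustrate the point about operators: once the values sit in a channel as plain items, `filter` and `map` can reshape them without another process. The sample names, numbers, and the threshold of 50 below are invented for the sketch:

```groovy
// Hypothetical tuples: (file name, mean read length, sd of read length)
stats = Channel.from(
    ['sample1.bam', 101.2, 3.4],
    ['sample2.bam',  48.7, 9.9] )

stats
    // keep only samples whose mean read length is at least 50 (arbitrary cutoff)
    .filter { it[1] >= 50 }
    // replace the two statistics with their ratio (coefficient of variation)
    .map { name, mean_len, sd_len -> [name, sd_len / mean_len] }
    .subscribe { println it }
```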
Jason Byars
@jbyars
Apr 29 2016 22:23
right, so is there a recommended approach: process the stdout in the process that generated it, or parse it as input in the next process?
Paolo Di Tommaso
@pditommaso
Apr 29 2016 22:26
for a few values I would just print them to the stdout
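A sketch of the stdout route, assuming the script prints nothing else to standard output. The channel name `stats_raw` and the comma-separated format are choices made for this example, not anything Nextflow requires:

```groovy
process get_stats {
  tag { bf }

  input:
  file bf from bamfiles

  output:
  // capture whatever the script prints to standard output
  set file(bf), stdout into stats_raw

  script:
  """
  #!/usr/bin/Rscript --vanilla
  suppressPackageStartupMessages(library(Rsamtools))

  param <- ScanBamParam(what="qwidth")
  bam <- scanBam("${bf}", param=param)[[1]]

  # print "mean,sd" as the only stdout content
  cat(mean(bam\$qwidth, na.rm=TRUE), sd(bam\$qwidth, na.rm=TRUE), sep=",")
  """
}

// turn the captured text into numbers before the next process consumes it
stats_raw
  .map { bf, out ->
      def (mean_len, sd_len) = out.trim().split(',').collect { it.toDouble() }
      [bf, mean_len, sd_len]
  }
  .set { stats }
```

Parsing in a `map` right after the process keeps downstream processes free of string handling; they receive the BAM file and two numeric values.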
Jason Byars
@jbyars
Apr 29 2016 22:27
ok. When it comes to resuming a failed run, if a process simply dumps to stdout, would that process have to run again, or would the stdout channel information be preserved?
Paolo Di Tommaso
@pditommaso
Apr 29 2016 22:28
no, it won't run it again
also the process stdout is cached
Jason Byars
@jbyars
Apr 29 2016 22:29
ok, that is really useful to know.
Jason Byars
@jbyars
Apr 29 2016 22:36
now I just have a bit of R invocation weirdness to work out. When I run .command.sh for the failed R script everything works fine. When I launch the script with nextflow I get a missing package error from R. It's almost like the library paths change
Paolo Di Tommaso
@pditommaso
Apr 29 2016 22:40
the best way is to debug it with bash -x .command.run