These are chat archives for nextflow-io/nextflow

19th
May 2017
Anton Goloborodko
@golobor
May 19 2017 00:43
Hi, Paolo! Just wanted to update - seems like there was a mixup with Docker versions. The machine with a frozen task also had the standard ubuntu docker.io (1.12 I believe). I initially reported 17.05.0-ce b/c the version has been updated after tasks were found to freeze. Seems like it's working with the new docker, which means that you were right about the cause of the problem! Thank you a lot, we'll keep you updated if some new interesting behavior comes up.
Paolo Di Tommaso
@pditommaso
May 19 2017 07:17
:+1:
Maxime Garcia
@MaxUlysse
May 19 2017 08:01
Hello
Paolo Di Tommaso
@pditommaso
May 19 2017 08:02
good morning
Maxime Garcia
@MaxUlysse
May 19 2017 08:02
I noticed an unexpected behavior when using a directory as an output
with publishDir
I'm trying to make a reproducible small example to make an issue on github
Paolo Di Tommaso
@pditommaso
May 19 2017 08:03
that's the best thing
Maxime Garcia
@MaxUlysse
May 19 2017 08:04
But basically, when I first run my pipeline, my publishDir will contain the directory as intended
and when I'm resuming, my publishDir will contain the output directory which will also contain the output directory
Paolo Di Tommaso
@pditommaso
May 19 2017 08:05
looks weird, please create a reproducible test case
Maxime Garcia
@MaxUlysse
May 19 2017 08:05
I'm on it ;-)
Thanh Lê
@thanhleviet
May 19 2017 08:05
good morning! Which editor do you guys suggest for writing nextflow? Thanks
Maxime Garcia
@MaxUlysse
May 19 2017 08:06
Atom / sublime
Paolo Di Tommaso
@pditommaso
May 19 2017 08:09
you can use any editor having groovy support, tho it won't help with the NF related syntax constructs
Thanh Lê
@thanhleviet
May 19 2017 08:13
I’m using sublime with Groovy syntax. So far so good. But I meant any tricks/tips for being more productive :)
Maxime Garcia
@MaxUlysse
May 19 2017 08:25
@pditommaso submitted #342 ;-)
Simone Baffelli
@baffelli
May 19 2017 08:25
How do I prevent nextflow from recognizing this expression as a command in a "shell" context?
file_ls = ["rate_geo":rate,"sig_rate_geo":sig_rate, "sig_ph_geo":sig_ph]
Paolo Di Tommaso
@pditommaso
May 19 2017 08:26
@MaxUlysse thanks :sunglasses:
Simone Baffelli
@baffelli
May 19 2017 08:26
I guess everything that looks like a string gets parsed as if it were a shell command
Paolo Di Tommaso
@pditommaso
May 19 2017 08:28
@baffelli um, how are you using it in the script?
Simone Baffelli
@baffelli
May 19 2017 08:29
I want to build a large command where I repeat the same operation for those files, but I don't want to define a separate process that receives each of them
like that
process geo_stacked{

  publishDir "$params.results/stacked/${stack_id}"

  input:
    set file(off_par), file(rate), file(sig_rate), file(sig_ph), val(stack_id) from stacked
    //geocoding data should be repeated for each file
    each gc_set from gc_for_stack

    output:
      set file(rate_geo), file(sig_rate_geo), file(sig_ph_geo) into gc_stack

    script:
      //We get the data from the geocoding info
      dem_seg_par = gc_set['dem_seg_par']
      lut = gc_set['lut']
      //we want to geocode all three stacking outputs
      file_ls = ["rate_geo":rate,"sig_rate_geo":sig_rate, "sig_ph_geo":sig_ph]
      file_ls.each{
        outpath,infile ->
          println("Geocoding ${infile}")
          '''
            wd=$(get_value !{off_par} interferogram_width)
            nl=$(get_value !{off_par} interferogram_azimuth_lines)
            tab_wd=$(get_value !{dem_seg_par} width)
            geocode_back !{infile} ${wd} ${lut} temp_geo ${nl} 0
            data2geotiff !{dem_seg_par} temp_geo 2 !{outpath}
          '''
        }


}
Paolo Di Tommaso
@pditommaso
May 19 2017 08:30
I see
in this case you need to concatenate the command into a string variable and return it as the last statement after the loop
eg
script: 
  def cmd = ''
  for( something ) {
    cmd += 'do this and that \n'
  }
  return cmd
you may want to put that piece of code in a separate function to make it more readable
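A minimal sketch of that suggestion in plain Groovy, assuming a hypothetical helper named buildCommand and a placeholder cp command standing in for the real geocoding calls (neither is from the chat). It appends one command line per map entry and returns the whole string:

```groovy
// Hypothetical helper: builds one multi-line shell command from a map
// of output name -> input file. The `cp` call is only a placeholder
// for whatever per-file commands the process really needs.
def buildCommand(Map<String, String> files) {
    def cmd = ''
    files.each { outpath, infile ->
        // += appends to the string; one command line per file
        cmd += "cp ${infile} ${outpath}\n"
    }
    return cmd
}

// Example call with made-up file names
print buildCommand([rate_geo: 'rate.bin', sig_rate_geo: 'sig_rate.bin'])
```

Inside a process, the script: block would then end with a call such as buildCommand(file_ls), so that the returned string is the last statement, as described above.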
Simone Baffelli
@baffelli
May 19 2017 09:19
Great, thanks!
Paolo Di Tommaso
@pditommaso
May 19 2017 09:23
:+1:
Simone Baffelli
@baffelli
May 19 2017 09:25
I must say nextflow is incredible. It took me much longer to get started with other tools!
Paolo Di Tommaso
@pditommaso
May 19 2017 09:25
you are very welcome!
thanks a lot, spread the word :D
Simone Baffelli
@baffelli
May 19 2017 09:26
I'm trying! I just told a colleague about it. But in our research community people look quite skeptically at those tools
they still believe they would make their work harder
:laughing:
Paolo Di Tommaso
@pditommaso
May 19 2017 09:27
I know very well
If you can blog about it, that would be great!
Simone Baffelli
@baffelli
May 19 2017 09:27
I will first of all put a reference to it in my upcoming conference poster ;)
Paolo Di Tommaso
@pditommaso
May 19 2017 09:28
well done
Sergey Venev
@sergpolly
May 19 2017 18:39
Hello! Distiller user here. I guess we have a problem that is on the nextflow side... It is related to space in filenames.
I'm running everything in the Dropbox folder, which has some spaces and "()" in it, and apparently nextflow creates symlinks without properly escaping such characters (is it even possible for symlinks? - no idea).
I'll try to explain with an example:
Here is the code of the process that is being executed:
process map_runs {
    tag "library:${library} run:${run} chunk:${chunk}"
    storeDir getIntermediateDir('bam_run')

    cpus params.map_cpus

    input:
    set val(library), val(run), val(chunk), file(fastq1), file(fastq2) from LIB_RUN_CHUNK_FASTQ
    set val(bwa_index_base), file(bwa_index_files) from BWA_INDEX.first()

    output:
    set library, run, "${library}.${run}.${chunk}.bam" into LIB_RUN_CHUNK_BAMS

    """
    bwa mem -t ${task.cpus} -SP \"${bwa_index_base}\" \"${fastq1}\" \"${fastq2}\" \
        | samtools view -bS > ${library}.${run}.${chunk}.bam \
        | cat
    """
}
Paolo Di Tommaso
@pditommaso
May 19 2017 18:43
ohhhh, path with blanks :(
Sergey Venev
@sergpolly
May 19 2017 18:43
Hi, again
Yes
Paolo Di Tommaso
@pditommaso
May 19 2017 18:44
hi
Sergey Venev
@sergpolly
May 19 2017 18:44
I know that the best way is NO blanks, but still
is it fixable at all?
Paolo Di Tommaso
@pditommaso
May 19 2017 18:45
anyhow blanks should be supported
Sergey Venev
@sergpolly
May 19 2017 18:45
in that example, nextflow creates a bunch of symlinks to indexed genome
set val(bwa_index_base), file(bwa_index_files) from BWA_INDEX.first()
bwa_index_files - this thing
and these links are being created in a working folder
Paolo Di Tommaso
@pditommaso
May 19 2017 18:46
please open an issue on GitHub providing a test case that replicates the issue
Sergey Venev
@sergpolly
May 19 2017 18:47
I'd be able to escape bwa_index_base and fastq1/2 using something like \"$(readlink -f something)\" ...
Sure
I'm kind of new to this collaborative githubbing etc, but I'll try my best
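For the quoting side of this, a small Groovy sketch of the usual POSIX single-quote technique, assuming a hypothetical helper named shellQuote (not part of Nextflow). It only covers how a path is written into the command string; it does not address how Nextflow creates the symlinks, which is the open question here:

```groovy
// Hypothetical helper: wraps a path in single quotes for a POSIX shell.
// An embedded single quote is written as '\'' (close the quotes, emit an
// escaped quote, reopen), so blanks and "()" reach the command unchanged.
def shellQuote(String path) {
    return "'" + path.replace("'", "'\\''") + "'"
}

// Example with a made-up path containing blanks and parentheses
println shellQuote('/home/user/Dropbox (lab)/genome.fa')
```

Single quotes are used here rather than double quotes because the shell performs no expansion at all inside them.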
Paolo Di Tommaso
@pditommaso
May 19 2017 18:49
nothing complicated, I need an example that allows me to replicate the problem
Sergey Venev
@sergpolly
May 19 2017 18:49
Sure sure, I'll create an example