These are chat archives for nextflow-io/nextflow

17th
Jun 2015
ekageyama
@ekageyama
Jun 17 2015 11:52
hello, I've been trying to figure out this problem, but haven't been able to
I have a folder with some gz files, and I want to map them and then do stuff with the alignments. I am using SGE
here is the code
process shore_import {

        input:
        val shore
        file lane
        file barcode

        output:
        //file 'mappings/1/sample*' into samples mode flatten
        file 'mappings/1/sample_*' into map_files
        """
        ${shore}/shore import -v fastq -x ${lane} -r ${barcode} -h 0 -D -c -g -n 5% -k 75 -B 1000000 --discard-trim-failures -o mappings
        """
}

process mapping {

        input:
        file map_files
        file genome
        val shore

        output:
        file "mappings/1/${samples}" into samples_consensus

        """
        ${shore}/shore mapflowcell -f ${samples} -i ${genome} -n 5% -g 5% -l 19 -s 1000 -P replace
        """
}
the output of the first part is just more folders, which are contained in the directory where the import ran, as mappings/1/sample_whatever
the first part runs fine, but after that, no job gets submitted
and I have checked that in the /work path where the first command ran, the correct output is there: there is indeed a mappings/1/ folder containing all the output folders
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:08
Can you include the Nextflow output when you run it?
ekageyama
@ekageyama
Jun 17 2015 12:16
N E X T F L O W  ~  version 0.14.1
Launching RadSeq.nf
[warm up] executor > sge
[a4/566af1] Cached process > shore_import (1)
it stays like that forever. I found sort of a solution:
if I remove the file genome and val shore inputs, it works
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:18
I see
now I understand
I think the problem is that shore is a channel
isn't it?
ekageyama
@ekageyama
Jun 17 2015 12:21
ahhh
yes, indeed that is the problem... thanks!
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:24
yes, a channel cannot be shared across multiple processes
ekageyama
@ekageyama
Jun 17 2015 12:25
so, then I need a way to have a global "file" variable
is there any trivial way to do this?
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:28
you can use a variable as input in many processes as long as it is not a channel variable
ekageyama
@ekageyama
Jun 17 2015 12:30
ok, I must be missing something, because shore looks like this:
scoring_matrix = file(params.scoring_matrix)
shore = file(params.shorepath)
lane = file(params.lane)
barcode = file(params.barcode)
genome = file(params.genome)
ahh, maybe it's because I'm using it as val and it's actually a file path?
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:31
actually, it should work this way
I would suggest changing it to file, though I don't think that is the problem
I'm wondering if it may be a bug
ekageyama
@ekageyama
Jun 17 2015 12:35
so, if I use params.shorepath directly it works
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:36
could you try whether this works?
process shore_import {

        input:
        file shore1 from shore
        file lane
        file barcode

        output:
        //file 'mappings/1/sample*' into samples mode flatten
        file 'mappings/1/sample_*' into map_files
        """
        ${shore1}/shore import -v fastq -x ${lane} -r ${barcode} -h 0 -D -c -g -n 5% -k 75 -B 1000000 --discard-trim-failures -o mappings
        """
}

process mapping {

        input:
        file map_files
        file genome
        file shore2 from shore

        output:
        file "mappings/1/${samples}" into samples_consensus

        """
        ${shore2}/shore mapflowcell -f ${samples} -i ${genome} -n 5% -g 5% -l 19 -s 1000 -P replace
        """
}
i.e. using the shore variable but giving it two different names
ekageyama
@ekageyama
Jun 17 2015 12:37
but it seems that it also stalls with the variable genome
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:39
it's weird
ekageyama
@ekageyama
Jun 17 2015 12:42
ok, well, I'll just use params :P
thanks a lot for the help!
Paolo Di Tommaso
@pditommaso
Jun 17 2015 12:42
:)
you are welcome
ekageyama
@ekageyama
Jun 17 2015 14:02
Ok, I'm here again. Now I'm having the problem that there is no output
but the issue is that my output is the same folder as the input. Is there a way to make this possible?
something like this

process mapping {

    input:
    file map from map_files
    //file genome
    //val shore

    output:
    file map into samples_consensus

    """
    export GENOMEMAPPER=/ebio/abt6_projects8/solexa_tools/genomemapper/
    ${params.shorepath}/shore mapflowcell -f ${map} -i ${params.genome} -n 5% -g 5% -l 19 -s 1000 -P replace

    """

}

ekageyama
@ekageyama
Jun 17 2015 14:17
so in the end, I just want to output the same folder I took as input. The reason is that the pipeline I am using always uses the same folder structure... I know, not very nice
Matthieu Foll
@mfoll
Jun 17 2015 15:53

Hi,
I read in the manual about moveTo that:

When a file with the same name as the target already exists, it will be replaced by the new one.

However, I have a problem: when the file already exists, I find the following error in the log:
Session aborted -- Cause: java.nio.file.FileAlreadyExistsException
I declare an output folder with

PDF_dir = file(params.bam_folder+'CALLS/PDF/')
PDF_dir.mkdirs()

I have a process emitting several files using file '*.pdf' into PDF and then I do

PDF.flatMap().subscribe { it.moveTo(PDF_dir) }

Am I doing something wrong?
Thanks!
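
For context on the exception quoted above: FileAlreadyExistsException comes from java.nio.file. The sketch below is plain Java and is not a statement about how Nextflow implements moveTo; the class and method names (MoveIntoDirDemo, movePlain, moveReplacing, demo) are made up for illustration. It only shows the cases in which java.nio.file.Files.move raises or avoids this exception: moving onto a free path, moving onto an existing file, moving onto an existing directory itself, and moving into the directory with REPLACE_EXISTING.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical demo class: illustrates java.nio behaviour only, not Nextflow internals.
class MoveIntoDirDemo {

    // Move with no options; returns "exists" when the target path is already taken.
    static String movePlain(Path source, Path target) {
        try {
            Files.move(source, target);
            return "moved";
        } catch (FileAlreadyExistsException e) {
            return "exists";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Move into a directory, overwriting a file of the same name if present.
    // Returns the content of the file now stored at the target path.
    static String moveReplacing(Path source, Path targetDir) {
        try {
            Path target = targetDir.resolve(source.getFileName());
            Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String demo() {
        try {
            Path work = Files.createTempDirectory("work");
            Path pdfDir = Files.createTempDirectory("pdf");

            // 1. Target path is free: the move succeeds.
            Path a = Files.write(work.resolve("a.pdf"), "first".getBytes(StandardCharsets.UTF_8));
            String r1 = movePlain(a, pdfDir.resolve("a.pdf"));

            // 2. A file with the same name already exists in the target directory.
            Path b = Files.write(work.resolve("a.pdf"), "second".getBytes(StandardCharsets.UTF_8));
            String r2 = movePlain(b, pdfDir.resolve("a.pdf"));

            // 3. The target is the existing directory itself, not a path inside it.
            String r3 = movePlain(b, pdfDir);

            // 4. Resolve the file name inside the directory and allow replacement.
            String r4 = moveReplacing(b, pdfDir);

            return String.join(",", r1, r2, r3, r4);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Whether moveTo in a given Nextflow version resolves the file name inside an existing target directory, or passes a replace option, would need to be checked against that version's documentation or source.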