These are chat archives for nextflow-io/nextflow

22nd
Apr 2016
Mike Smoot
@mes5k
Apr 22 2016 18:35
Hi, I'm wondering if there's a way to reuse a process in a pipeline, but with different input channels? My situation is that I'm doing a bowtie2 mapping on one channel of read pairs with one index and I'd like to run the same bowtie2 command on a separate channel of read pairs with a different index. The process would be the same in both cases, just with a different channel. Is this possible?
Paolo Di Tommaso
@pditommaso
Apr 22 2016 18:39
Hi, you mean different index files, but the input definitions are the same, right?
Mike Smoot
@mes5k
Apr 22 2016 19:23
Yes, different index files and different read pairs. If I were to cut/paste a process, everything would be identical except for the names of the channels.
Paolo Di Tommaso
@pditommaso
Apr 22 2016 19:25
well, you don't need to duplicate the process for that
the same process can be executed any number of times, just by feeding it different input sets
Mike Smoot
@mes5k
Apr 22 2016 19:37
Ok, that sounds perfect, although I'm not quite sure how to code it... Can you point me to an example where one process is called with two different channels?
Paolo Di Tommaso
@pditommaso
Apr 22 2016 19:38
I would suggest to start with this
It should not be too difficult to modify it to handle multiple indexes
Mike Smoot
@mes5k
Apr 22 2016 19:47
That's more-or-less what I'm doing, but in my case I need to call mapping twice. Here is (roughly) my code.
process mapping_silvassu {
    input:
    set sample, file(read1), file(read2) from read_pairs
    val threads
    val index from silvassu_index
    val name from silvassu_name

    output:
    set sample, "${sample}.${name}.bam" into bam

    """
    bowtie2 -k 1 --very-fast-local --threads ${threads} -x ${index} -1 ${read1} -2 ${read2} 2> ${sample}.${name}.mapping.stats | samtools view -@ ${threads} -bS -o ${sample}.${name}.bam - 
    """
}

process mapping_univec {
    input:
    set sample, file(read1), file(read2) from silva_reads
    val threads
    val index from univec_index
    val name from univec_name

    output:
    set sample, "${sample}.${name}.bam" into univec_bam

    """
    bowtie2 -k 1 --very-fast-local --threads ${threads} -x ${index} -1 ${read1} -2 ${read2} 2> ${sample}.${name}.mapping.stats | samtools view -@ ${threads} -bS -o ${sample}.${name}.bam -
    """
}
Sorry, that didn't come out too clearly.
Paolo Di Tommaso
@pditommaso
Apr 22 2016 19:52
OK
You can reduce it to a single process; the main problem is feeding it the appropriate inputs
I would try something like this

process mapping_univec {
    input:
    set name, index, sample, file(read1), file(read2) from index_and_reads

    output:
    set sample, "${sample}.${name}.bam" into univec_bam

    """
    bowtie2 -k 1 --very-fast-local --threads ${task.cpus} -x ${index} -1 ${read1} -2 ${read2} 2> ${sample}.${name}.mapping.stats | samtools view -@ ${task.cpus} -bS -o ${sample}.${name}.bam -
    """
}
the problem then becomes creating a channel that emits matching index and read pairs
Paolo Di Tommaso
@pditommaso
Apr 22 2016 19:59
is the number of indexes fixed (2)?
or can it be any number?
Mike Smoot
@mes5k
Apr 22 2016 20:00
It's just one index per call.
Paolo Di Tommaso
@pditommaso
Apr 22 2016 20:01
thus, you should create a channel emitting a tuple for each of them
each tuple holds the index and the read files
this depends on how you want to match the index file with the reads
Mike Smoot
@mes5k
Apr 22 2016 20:03
Hmmm. I think I see where you're going, however the second set of reads is produced as part of the pipeline so: initial reads -> first mapping -> bunch more stuff creating a new channel -> second mapping.
Paolo Di Tommaso
@pditommaso
Apr 22 2016 20:05
ah, the second mapping process depends on the output of the previous one.
Mike Smoot
@mes5k
Apr 22 2016 20:05
Yeah, sorry that wasn't clear
Paolo Di Tommaso
@pditommaso
Apr 22 2016 20:06
if so, I would suggest duplicating it, because it's logically a different step
if you don't want to duplicate the bowtie command line, put it into an external bash script or a template file
tip: any executable script in PROJECT_ROOT/bin is automatically available on the process PATH
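A minimal sketch of what such a shared script might look like, assuming it is saved as an executable file in the project's bin directory. The file name `map_reads.sh`, the function name `map_reads`, and the argument order are hypothetical. The bowtie2 and samtools options are copied from the processes above.

```shell
#!/usr/bin/env bash
# Hypothetical bin/map_reads.sh: holds the mapping command line shared by both processes.
# The script name, function name, and argument order are assumptions for illustration.
set -euo pipefail

map_reads() {
    local threads="$1" index="$2" read1="$3" read2="$4" prefix="$5"
    # Same bowtie2 | samtools pipeline as in the two mapping processes:
    # bowtie2 stderr goes to <prefix>.mapping.stats, the BAM goes to <prefix>.bam.
    bowtie2 -k 1 --very-fast-local --threads "$threads" -x "$index" \
            -1 "$read1" -2 "$read2" 2> "${prefix}.mapping.stats" \
        | samtools view -@ "$threads" -bS -o "${prefix}.bam" -
}

# Run the mapping when the script is called with arguments, e.g.
#   map_reads.sh 4 /path/to/index r1.fq r2.fq sampleA.univec
if [ "$#" -gt 0 ]; then
    map_reads "$@"
fi
```

Each duplicated process would then shrink its script block to one line such as `map_reads.sh ${task.cpus} ${index} ${read1} ${read2} ${sample}.${name}`, so the bowtie2 options live in a single place while the two logical steps stay separate.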
Mike Smoot
@mes5k
Apr 22 2016 20:08
yeah, duplicating the command was the primary concern. I'll make a separate script for that. Thanks for your help!
Paolo Di Tommaso
@pditommaso
Apr 22 2016 20:08
:+1: