These are chat archives for nextflow-io/nextflow

10th
Apr 2019
sureshhewa
@sureshhewabi
Apr 10 08:02
@stevekm Thanks for sharing your solution
Hugues Fontenelle
@huguesfontenelle
Apr 10 09:07
Hi @pditommaso
When using docker, is there a way to skip the --entrypoint /bin/bash override in the docker run command of the generated .command.run?
and the answer is ? :D
Hugues Fontenelle
@huguesfontenelle
Apr 10 09:36
mmh.. no?
Hugues Fontenelle
@huguesfontenelle
Apr 10 09:53
I realize that skipping it would not work with any decent image that specifies an entrypoint (biocontainer etc).
One partner has a VM with docker where the --entrypoint docker param does not work. I guess they'll have to fix it.
Paolo Di Tommaso
@pditommaso
Apr 10 10:42
umm, not sure how to help. NF needs to use the bash interpreter, so what's the point of using a different entry point?
Luca Cozzuto
@lucacozzuto
Apr 10 11:27
Hi @pditommaso, is it possible to have a max retry and then an ignore?
like try this 4 times and if not possible to make it, ignore it
Paolo Di Tommaso
@pditommaso
Apr 10 11:30
Luca Cozzuto
@lucacozzuto
Apr 10 11:30
thanks!
Paolo Di Tommaso
@pditommaso
Apr 10 11:31
:v:
Chelsea Sawyer
@csawye01
Apr 10 12:23

I'm having a bit of an issue using the publishDir directive with a program that almost automatically outputs the directory and file structure that I want. Cell Ranger outputs the directories as projectname/outs/fastq_path/sampleName/sampleName.fastq.gz and I would like it to be projectname/FastQ/sampleName/sampleName.fastq.gz. When it runs no output gets put into that folder that I thought I specified so I think I have it a bit wrong.
```process cellRangerMkFastQ {
tag "${sheet.name}"
publishDir path: "${params.outdir}", mode: 'copy',
saveAs: { filename ->
if (filename.endsWith("/outs/fastq_path//*.fastq.gz")) "FastQ/${filename.getParent().getParent().getName()}/${filename.getParent().getName()}/$filename"
else if (filename.endsWith("
/outs/fastqpath/Undetermined.fastq.gz")) "FastQ/$filename"
else if (filename.endsWith("
/outs/fastq_path/Reports")) "FastQ/$filename"
else if (filename.endsWith("*/outs/fastq_path/Stats")) "FastQ/$filename"}
input:
file sheet from samplesheet

output:
file "*/outs/fastq_path/*/**.fastq.gz" into cr_fastqs_count_ch, cr_fastqs_fqc_ch, cr_fastqs_screen_ch mode flatten
file "*/outs/fastq_path/Undetermined_*.fastq.gz" into cr_undetermined_default_fq_ch mode flatten

script:
"""
cellranger mkfastq --run ${runName_dir} --samplesheet ${sheet}
"""

}
```

Qi ZHAO
@likelet
Apr 10 12:23
Hi @pditommaso, I have two conditional processes, each of which may produce a channel for a downstream process (I concat the channels later). I also created two null channels in case one of the two processes is not executed. I found that the pipeline got stuck when either of them was not run. Is there a way around this? Thanks!
Qi ZHAO
@likelet
Apr 10 12:29
Hmm, ignore this. I fixed it by changing Channel.create() to Channel.empty()
Paolo Di Tommaso
@pditommaso
Apr 10 12:34
@csawye01 to format code, type a triple ` followed by a new line, then your code, and close with another triple `
Chelsea Sawyer
@csawye01
Apr 10 12:38
Thanks @pditommaso! Here it is fixed:
I'm having a bit of an issue using the publishDir directive with a program that almost automatically outputs the directory and file structure that I want. Cell Ranger outputs the directories as projectname/outs/fastq_path/sampleName/sampleName.fastq.gz and I would like it to be projectname/FastQ/sampleName/sampleName.fastq.gz. When it runs no output gets put into that folder that I thought I specified so I think I have it a bit wrong.
process cellRangerMkFastQ {
    tag "${sheet.name}"
    publishDir path: "${params.outdir}", mode: 'copy',
             saveAs: { filename ->
               if (filename.endsWith("*/outs/fastq_path/*/**.fastq.gz")) "FastQ/${filename.getParent().getParent().getName()}/${filename.getParent().getName()}/$filename"
               else if (filename.endsWith("*/outs/fastq_path/Undetermined_*.fastq.gz")) "FastQ/$filename"
               else if (filename.endsWith("*/outs/fastq_path/Reports")) "FastQ/$filename"
               else if (filename.endsWith("*/outs/fastq_path/Stats")) "FastQ/$filename"}
    input:
    file sheet from samplesheet

    output:
    file "*/outs/fastq_path/*/**.fastq.gz" into cr_fastqs_count_ch, cr_fastqs_fqc_ch, cr_fastqs_screen_ch mode flatten
    file "*/outs/fastq_path/Undetermined_*.fastq.gz" into cr_undetermined_default_fq_ch mode flatten

    script:
    """
    cellranger mkfastq --run ${runName_dir} --samplesheet ${sheet}
    """
}
Paolo Di Tommaso
@pditommaso
Apr 10 12:38
good, much easier to read
now I have to understand your problem .. :D
therefore you have projectname/outs/fastq_path/sampleName/sampleName.fastq.gz
and you would like to publish it to projectname/FastQ/sampleName/sampleName.fastq.gz ?
Chelsea Sawyer
@csawye01
Apr 10 12:40
@pditommaso yes thats exactly it!
Paolo Di Tommaso
@pditommaso
Apr 10 12:41
well, change the first if ..
well, actually, isn't it already doing that?
Chelsea Sawyer
@csawye01
Apr 10 13:12
@pditommaso I'm not sure what you mean?
Adam Nunn
@bio15anu
Apr 10 13:17
@pditommaso very quick question: is it possible either to directly define or to access the current session runName from the nextflow.config file? I am struggling a bit with getting that to work.
KochTobi
@KochTobi
Apr 10 13:20
@bio15anu You can access workflow.runName in your main.nf script. I don't know about the config.
Adam Nunn
@bio15anu
Apr 10 13:31
The thing is I have had a user request to have work directories be named according to the different pipeline runNames, which sounded reasonable enough but now I'm not actually sure if this is possible. I think workDir can only be defined from command line or nextflow.config ?
Paolo Di Tommaso
@pditommaso
Apr 10 13:37
that would break the cache/resume mechanism
you can have different run names for the same work dir ..
Chelsea Sawyer
@csawye01
Apr 10 13:47
@pditommaso so just this?
process cellRangerMkFastQ {
    tag "${sheet.name}"
    publishDir path: "${params.outdir}", mode: 'copy',
             saveAs: { filename ->
               if (filename.endsWith("*/outs/fastq_path/*/**.fastq.gz")) "FastQ/$filename"
               else if (filename.endsWith("*/outs/fastq_path/Undetermined_*.fastq.gz")) "FastQ/$filename"

    input:
    file sheet from samplesheet

    output:
    file "*/outs/fastq_path/*/**.fastq.gz" into cr_fastqs_count_ch, cr_fastqs_fqc_ch, cr_fastqs_screen_ch mode flatten
    file "*/outs/fastq_path/Undetermined_*.fastq.gz" into cr_undetermined_default_fq_ch mode flatten

    script:
    """
    cellranger mkfastq --run ${runName_dir} --samplesheet ${sheet}
    """
}
Ólavur Mortensen
@olavurmortensen
Apr 10 14:00

I'm not sure if this is the proper place to ask something like my following question. If it isn't, please let me know.

I'm writing an nf script that I intend to run as a standalone script, that is, not within a docker container or anything like that. Does anyone have an idea how to load software in such a script? Maybe via config somehow?

KochTobi
@KochTobi
Apr 10 14:03
You could still use conda although it is very slow. What is the reason for deciding against a containerized solution?
Paolo Di Tommaso
@pditommaso
Apr 10 14:07
@csawye01 if you are asking about the syntax, it is missing a } ..
 saveAs: { filename ->
               if (filename.endsWith("*/outs/fastq_path/*/**.fastq.gz")) "FastQ/$filename"
               else if (filename.endsWith("*/outs/fastq_path/Undetermined_*.fastq.gz")) "FastQ/$filename" 
              }
also, filename.endsWith("*/outs/fastq_path/*/**.fastq.gz") is not supposed to work: endsWith compares literal text, so wildcards are not expanded this way
you can do if (filename.endsWith(".fastq.gz")) "FastQ/$filename", or write a regex for that
with the following disclaimer https://xkcd.com/208/
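The point about wildcards can be checked outside Nextflow. The following is a minimal Python sketch, used purely for illustration rather than as Nextflow code: `str.endswith` compares literal characters in the same way as the `endsWith` call above, so a suffix containing `*` never matches a real path. The helper name `publish_target` and the sample path are hypothetical, and the regex assumes the `project/outs/fastq_path/sample/file.fastq.gz` layout described in the question.

```python
import re

# Hypothetical relative path, following the layout described in the question.
path = "projectname/outs/fastq_path/sampleName/sampleName.fastq.gz"

# A suffix test compares literal characters; '*' is not expanded as a wildcard.
assert not path.endswith("*/outs/fastq_path/*/**.fastq.gz")
assert path.endswith(".fastq.gz")

# Assumed layout: <project>/outs/fastq_path/<sample>/<file>.fastq.gz
FASTQ_RE = re.compile(
    r"^(?P<project>[^/]+)/outs/fastq_path/(?P<sample>[^/]+)/(?P<file>[^/]+\.fastq\.gz)$"
)


def publish_target(relative_path):
    """Return the remapped publish path, or None if the path does not match."""
    match = FASTQ_RE.match(relative_path)
    if match is None:
        return None
    return "{project}/FastQ/{sample}/{file}".format(**match.groupdict())


print(publish_target(path))
```

Paths that do not follow the assumed layout, such as a Reports directory directly under fastq_path, return None here, so they would need their own rule.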
Ólavur Mortensen
@olavurmortensen
Apr 10 14:12
@KochTobi Well, having everything containerized is great for pipelines, but for throwaway scripts I just don't want to bother with that. Maybe that's just because I'm not so skilled with Docker idk :D
Chelsea Sawyer
@csawye01
Apr 10 14:12
@pditommaso haha thats an excellent disclaimer! The closing tag } was there but I had removed a few things to simplify it to put on here and forgot to put that back in. I'll try your suggestion, thanks!
Paolo Di Tommaso
@pditommaso
Apr 10 14:12
:+1:
actually this was the correct citation !
:smile:
Chelsea Sawyer
@csawye01
Apr 10 14:15
@pditommaso haha even better!
Ólavur Mortensen
@olavurmortensen
Apr 10 16:23

I'm getting an error that I find quite bizarre. Nextflow seems to be splitting my command into different lines, and I don't understand why. I've included the process causing the error and the error report below (look at the "script" part in the process and the "command executed" in the error log).

Anyone know what's going on?

process make_new_samplenames {
    input:
    val nn from n_samples_ch

    output:
    file 'samplenames.txt' into new_samplenames_ch

    script:
    """ 
    make_new_samplenames.py $nn samplenames.txt
    """ 
}
ERROR ~ Error executing process > 'make_new_samplenames (1)'

Caused by:
  Process `make_new_samplenames (1)` terminated with an error exit status (1)

Command executed:

  make_new_samplenames.py 206
  samplenames.txt

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/some/path/here/nf/bin/make_new_samplenames.py", line 6, in <module>
      filename = sys.argv[2]
  IndexError: list index out of range

Work dir:
  /some/path/here/work/76/f24affeaf6f9b7c9c7112121115ac5
micans
@micans
Apr 10 16:24
Looks like there is a newline in nn
Ólavur Mortensen
@olavurmortensen
Apr 10 16:25
how so?
special word?
micans
@micans
Apr 10 16:25
Command executed indicates that nn has the value "206\n"
what goes into n_samples_ch? I would check that.
Alternatively you can .map { it.trim() } that channel
Ólavur Mortensen
@olavurmortensen
Apr 10 16:30
Really new to nf, sorry, should I do like channel1.map { it.trim() }.into(channel2)?
micans
@micans
Apr 10 16:31
Can you try val nn from n_samples_ch.map { it.trim() } ?
I'm curious, what happens in the creation of n_samples_ch?
Ólavur Mortensen
@olavurmortensen
Apr 10 16:37
You were exactly right, there was a newline at the end of the nn variable. Basically, a process runs a command that outputs a single number, and I send this number to n_samples_ch via stdout. I just didn't consider that this could include a newline.
val nn from n_samples_ch.map { it.trim() } worked
Thanks :D
micans
@micans
Apr 10 16:37
Yes, it is a bit of a gotcha, I've been there as well
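The gotcha is easy to reproduce outside Nextflow. The Python sketch below is only an illustration under assumptions: the child command that prints 206 is made up to mirror the example above. Text captured from a command's standard output normally ends with a newline, and interpolating it into a command line splits that line in two unless the value is stripped first, which is the same idea as .map { it.trim() } on the channel.

```python
import subprocess
import sys

# Stand-in for the upstream process: a child command that prints one number.
result = subprocess.run(
    [sys.executable, "-c", "print(206)"],
    capture_output=True,
    text=True,
    check=True,
)
raw = result.stdout    # captured stdout keeps the trailing newline
clean = raw.strip()    # same idea as it.trim() on the channel value


def render(nn):
    """Build the command line the way the script block interpolates nn."""
    return f"make_new_samplenames.py {nn} samplenames.txt"


print(repr(raw))
print(render(raw).splitlines())    # two lines: the output file name is cut off
print(render(clean).splitlines())  # one line: both arguments stay together
```

With the raw value the rendered command spans two lines, so the script is invoked with a single argument, which is consistent with the IndexError on sys.argv[2] in the traceback.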
Ethan Bensman
@ebenz99
Apr 10 18:29

I'm running into a problem using nextflow with docker on a US-based Kubernetes research cluster. My process runs completely fine with directory-independent commands like 'ls' and 'pwd', but as soon as I try to run a script I know to be in my launch directory per my nextflow.config file, nextflow says it can't find the file I'm trying to run.

I feel pretty confident that nextflow is working in the directory I expect it to be in, but I suspect that it's not moving my docker image files into there. Does anyone have any experience or advice with a problem like this?

Vladimir Kiselev
@wikiselev
Apr 10 18:48
@ebenz99 As far as I know, you need to put your scripts in the bin directory and also use $baseDir everywhere you are using the launch directory
also worth trying to run it locally, does it throw the same error?
Ethan Bensman
@ebenz99
Apr 10 18:49
Ah ok, thanks, I'll work on trying to run it from bin. But yes, for what it's worth, it works fine locally