These are chat archives for nextflow-io/nextflow

21st
Jan 2019
Stephen Kelly
@stevekm
Jan 21 02:56

Is there a way to write a Groovy expression that will allow me to set process.queue if params.queue was passed, but leave it unset if it wasn't? For example:

process.queue = {
    if(params.queue != '') params.queue
    else null
}

??

is that valid?
Stephen Kelly
@stevekm
Jan 21 03:25

so no way we can tell the users which steps in the process nextflow is working on, like giving a bit of info during its run?

You would probably want the http weblog feature; https://www.nextflow.io/docs/latest/tracing.html#weblog-via-http

however you would also need an API or web server application of some sort to catch the messages; at one point I had worked on one here: https://github.com/stevekm/nf-dashboard
It's not terribly complicated, so you would probably just want to build your own
Otherwise, I typically just read the stdout log from Nextflow itself. You can tee it to a separate file if you don't want it attached to your terminal window: nextflow run pipeline.nf 2>&1 | tee -a "${LOGFILE}"
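
A minimal sketch of the kind of receiver mentioned above, written in plain Groovy with the JDK's built-in HTTP server. It assumes the weblog messages are JSON objects with `runName`, `event` and (for process events) `trace.process` fields; those field names, the helper names `summarize` and `startReceiver`, and port 8000 are all assumptions for illustration, not something confirmed in this chat.

```groovy
import com.sun.net.httpserver.HttpServer
import groovy.json.JsonSlurper

// Turn one weblog JSON payload into a one-line status message.
// The field names (runName, event, trace.process) are assumptions.
String summarize(String json) {
    def msg = new JsonSlurper().parseText(json)
    def proc = msg.trace?.process ?: '-'
    return "${msg.runName} ${msg.event} ${proc}"
}

// Hypothetical receiver: prints a summary for every request body it gets.
HttpServer startReceiver(int port) {
    def server = HttpServer.create(new InetSocketAddress(port), 0)
    server.createContext('/') { exchange ->
        def body = exchange.requestBody.getText('UTF-8')
        if (body) println summarize(body)
        exchange.sendResponseHeaders(200, -1)   // 200 OK, no response body
        exchange.close()
    }
    server.start()
    return server
}

// Uncomment to listen on port 8000 (blocks the script from exiting):
// startReceiver(8000)
```

With the receiver running, the pipeline would be pointed at it with something like: nextflow run pipeline.nf -with-weblog http://localhost:8000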

Rad Suchecki
@rsuchecki
Jan 21 04:34
process.queue = { params.queue == null ? null : params.queue }
or something along those lines?
@stevekm
and since a default value for params.queue should be set,
params.queue = ''
process.queue = { params.queue == '' ? null : params.queue }
Stephen Kelly
@stevekm
Jan 21 06:27
ok yeah that is what I was thinking, but I was also not sure if setting it to null would actually prevent it from being used or if it would try to coerce null to a string as the name of a queue..
Rad Suchecki
@rsuchecki
Jan 21 08:01
Didn't check but assumed this is the internal default
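
A small plain-Groovy sketch of the logic being discussed, outside of Nextflow. The helper name `resolveQueue` is hypothetical; it only shows that Groovy truth treats both null and the empty string as "not set". Whether Nextflow then treats a null queue as unset is not confirmed here, as noted above.

```groovy
// Hypothetical helper: returns the value a queue closure would yield
// for a given params map.
def resolveQueue(Map params) {
    // Groovy truth: null and '' both evaluate to false
    return params.queue ? params.queue : null
}

assert resolveQueue([queue: '']) == null       // default empty string
assert resolveQueue([:]) == null               // parameter never defined
assert resolveQueue([queue: 'long']) == 'long' // parameter passed

// In nextflow.config the same expression would read:
// process.queue = { params.queue ?: null }
```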
rfenouil
@rfenouil
Jan 21 08:39
Hello, quick question:
When I use Channel.fromPath(...).groupBy(...).flatMap() each emitted element is an associative array with a single key (possibly referring to several paths).
Is it ok to use that as 'file' input for a process (staging) ?
Paolo Di Tommaso
@pditommaso
Jan 21 11:22
groupBy was an old experiment, use groupTuple instead
micans
@micans
Jan 21 12:01

I'm experimenting with files, filter functions, and failures ... one question that comes up is this: If I have

  errorStrategy { task.attempt <= 2 ? 'retry' : 'ignore' }
  maxRetries 1

I know this doesn't make a lot of sense (discrepancy between maxRetries and hardcoded 2), but with this my run terminates rather than ignores errors as I expected. (TBH this is a slight distraction, I'm trying to replicate other behaviour).

rfenouil
@rfenouil
Jan 21 12:02
@pditommaso Thank you, will try
rfenouil
@rfenouil
Jan 21 12:20
groupTuple works like a charm :)
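
For reference, a sketch of what the groupTuple approach might look like. It assumes input files named like sampleA_1.fastq, so that the text before the first underscore can serve as the grouping key; the data/ location, the channel name and the process name are hypothetical.

```groovy
Channel
    .fromPath('data/*.fastq')                      // hypothetical input location
    .map { f -> [ f.name.tokenize('_')[0], f ] }   // key = text before the first underscore
    .groupTuple()                                  // emits [ key, [file, file, ...] ]
    .set { grouped_ch }

process list_group {
    input:
    set val(sample), file(reads) from grouped_ch

    script:
    """
    echo "${sample}: ${reads}"
    """
}
```

Each emitted element is a key plus a list of files, so the list can be staged through an ordinary file input.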
Paolo Di Tommaso
@pditommaso
Jan 21 12:32
:+1:
rfenouil
@rfenouil
Jan 21 12:36
Another question:
I use ${ fileInputTab.collect{ it.name.inspect() } } in a script block to quote file names from a list. Is there a simpler way of doing this ?
File names must be quoted for this bash command because they may contain spaces
Paolo Di Tommaso
@pditommaso
Jan 21 12:37
I would do
${ fileInputTab.collect{ "'$it'"}.join(' ') }
rfenouil
@rfenouil
Jan 21 12:39
Great I like it. Thank you !
(I forgot to copy the join in my question, sorry for the non-functional example)
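
The suggested expression can be checked in plain Groovy, with strings standing in for the staged files. The sample names are made up, and the sketch assumes no file name itself contains a single quote, which this simple quoting would not handle.

```groovy
// Strings stand in for the staged file objects
def fileInputTab = ['my sample.txt', 'other.txt']

// Wrap each name in single quotes, then join with spaces
def quoted = fileInputTab.collect { "'$it'" }.join(' ')

assert quoted == "'my sample.txt' 'other.txt'"
println quoted
```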
micans
@micans
Jan 21 13:07
@pditommaso can I humbly ask you the question a little bit above ...
Paolo Di Tommaso
@pditommaso
Jan 21 13:09
(in a call now)
micans
@micans
Jan 21 13:10
:+1: (sorry)
thought you might have missed it!
Timothy R. Fallon
@photocyte
Jan 21 14:02
Is there a recommended way to get the current contained path from a queue channel? I noticed if you use the queue channel directly in the tag{} directive, it prints out the path, but I want to get the path in the Groovy code part of the nextflow script for a regex.
Shellfishgene
@Shellfishgene
Jan 21 14:17
Hi, I'm trying to get the nf-core RNASeq pipe installed. I have beforeScript = 'module load java1.8.0 singularity2.4.4' in my config, but still get bash: singularity: command not found. Should that work or does nextflow need singularity before the beforeScript stuff is executed, to download the image?
Stephen Kelly
@stevekm
Jan 21 14:31
you need to put the modules in the module directive, like this: module = 'java1.8.0:singularity2.4.4'
if you read through the .command.run files that are produced, it will make a lot more sense, since these are the scripts that actually get run
Shellfishgene
@Shellfishgene
Jan 21 14:32
Is the module directive not specific to a single process?
Stephen Kelly
@stevekm
Jan 21 14:32
it depends on how you declare it
and where
put it in your nextflow.config
Shellfishgene
@Shellfishgene
Jan 21 14:33
Ok, I copied the beforeScript version from another config file.
Stephen Kelly
@stevekm
Jan 21 14:33
and then you can do process.module = 'java1.8.0:singularity2.4.4' if you want it for all processes
Shellfishgene
@Shellfishgene
Jan 21 14:34
I have another problem, I get /bin/bash: line 0: cd: /work_beegfs/user/RNASeq/work/69/ba35ccdbbcf2e67eb5957b76741420: No such file or directory. However the directory is there, and contains the needed links to the data files, at least after nextflow stops.
Stephen Kelly
@stevekm
Jan 21 14:44
yeah you need to enable the auto-mount setting for Singularity; it's in the Nextflow Singularity config docs, I think
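
Put together, the two suggestions above might look like this in nextflow.config. This is only a sketch: the module names are copied from the question and depend on the cluster, and `singularity.enabled` is assumed to be wanted here.

```groovy
// nextflow.config (sketch)

process {
    // load these environment modules for every process
    module = 'java1.8.0:singularity2.4.4'
}

singularity {
    enabled    = true
    // let Nextflow add the bind mounts for the work directory and inputs
    autoMounts = true
}
```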
Shellfishgene
@Shellfishgene
Jan 21 14:48
Hmm, now I get WARNING: Skipping user bind, non existent bind point (directory) in container: '/work_beegfs/user/RNASeq'
Paolo Di Tommaso
@pditommaso
Jan 21 14:49
@micans yes, it terminates because you specified maxRetries 1, therefore after the first retry it should not retry any more => fail
Stephen Kelly
@stevekm
Jan 21 14:49
yes for Singularity you need to pre-create the matching bind point directories inside the container
there are ways to avoid this requirement for Singularity but it requires changing some security settings that your admins probably will not like
Shellfishgene
@Shellfishgene
Jan 21 14:50
Ok, I'm new to singularity. That sounds like having nextflow download the singularity image will not work?
Paolo Di Tommaso
@pditommaso
Jan 21 14:50
Singularity you need to pre-create the matching bind point directories inside the container
only for old kernels not supporting file system overlay, I would add
Stephen Kelly
@stevekm
Jan 21 14:51
^^ yes. Which is what I had when we were using Singularity v. 2.4 lol
Shellfishgene
@Shellfishgene
Jan 21 14:52
So singularity 3.0 fixes this issue? Or only with a newer linux kernel?
Stephen Kelly
@stevekm
Jan 21 14:52

If the downloaded version of the container does not work then you can simply make your own. Lots of examples here https://github.com/NYU-Molecular-Pathology/containers

just set the 'from' section of the container to the one you want to download, then you can just make the directory inside the recipe and build your own version

pretty sure it's a kernel issue, because the feature has been available for a long time provided you have system support for it
Paolo Di Tommaso
@pditommaso
Jan 21 14:54
So singularity 3.0 fixes this issue? Or only with a newer linux kernel?
verify with the Singularity folks; AFAIK it's not an issue any more with recent Singularity versions >= 2.6
micans
@micans
Jan 21 14:55
@pditommaso I'm confused ... errorStrategy { task.attempt <= 2 ? 'retry' : 'ignore' } to me looks as if the errorStrategy will be either 'retry' or 'ignore', and never 'terminate'
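
A sketch of a process in which the two settings agree. It assumes, following the explanation above, that a 'retry' answer is only honoured while the retry count stays within maxRetries, and that the run fails otherwise. The process name is hypothetical and the script fails on purpose.

```groovy
process flaky_step {
    // attempts 1 and 2 ask for a retry, attempt 3 asks to ignore the failure
    errorStrategy { task.attempt <= 2 ? 'retry' : 'ignore' }
    // allow as many retries as the closure can ask for
    maxRetries 2

    script:
    """
    exit 1
    """
}
```

With maxRetries 1, the closure still answers 'retry' on the second failed attempt, but no retries are left, which would explain the termination.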
Stephen Kelly
@stevekm
Jan 21 14:57

@photocyte

Is there a recommended way to get the current contained path from a queue channel? I noticed if you use the queue channel directly in the tag{} directive, it prints out the path, but I want to get the path in the Groovy code part of the nextflow script for a regex.

I think it depends slightly on what kind of item you are operating on. For example, if you are trying to get the path to process output items, you can do it like this

Channel.from(1..10).set { input_ch }

process make_file {
    input:
    val(x) from input_ch

    output:
    file("${output_file}") into output_files

    script:
    output_file = "${x}.txt"
    """
    printf "${x}\t\$(date +"%Y-%m-%d-%H-%M-%S")\t\$(hostname)\n" > "${output_file}"
    """
}
// subscribe runs the closure for each emitted file; nothing is passed downstream
output_files.subscribe { item ->
    println "${item}"
    println "${item.parent}"
}

looks like this:

/gpfs/data/molecpathlab/development/nextflow-demos/slurm-queue/work/6e/5a7548a25bb136ed02b3050bd58910/1.txt
/gpfs/data/molecpathlab/development/nextflow-demos/slurm-queue/work/6e/5a7548a25bb136ed02b3050bd58910

However if you are reading the items in yourself there are a few other ways to do it as well, such as

Channel.from("main.nf").set { input_2_ch }
input_2_ch.map{ item ->
    def full_path = new File(item).getCanonicalPath()
    println "${full_path}"
}

output:

/gpfs/data/molecpathlab/development/nextflow-demos/slurm-queue/main.nf
Timothy R. Fallon
@photocyte
Jan 21 14:59
Thanks, the getCanonicalPath() looks like it might be the trick! Using groovy.util.FileNameByRegexFinder with the same input as the Channel.fromPath() also worked
Stephen Kelly
@stevekm
Jan 21 15:00
don't forget that you can also pass glob patterns (not full regex) directly into Channel.fromPath(), e.g. Channel.fromPath('*.fastq'), etc
Timothy R. Fallon
@photocyte
Jan 21 15:01
Yes, that's the case; I'm using that, but I wanted to put what the wildcard resolves to into a param variable
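
One possible way to do this inside the Nextflow script is the file() function, which returns a list of matching paths when its argument contains glob wildcards. A sketch, with `params.fastq_glob` and `fastq_ch` as hypothetical names:

```groovy
params.fastq_glob = '*.fastq'   // hypothetical parameter holding the pattern

// file() with a glob returns a list of matching paths (empty if nothing matches)
def resolved = file(params.fastq_glob)
resolved.each { println it.toAbsolutePath() }

// the same pattern can still feed a channel
Channel.fromPath(params.fastq_glob).set { fastq_ch }
```

The `resolved` list is then available to ordinary Groovy code, for example for regex matching on the path strings.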
Tim Dudgeon
@tdudgeon
Jan 21 17:05
I want to check that my understanding of how to provide data files to Nextflow when using environments like Univa Grid Engine is correct. In general, Nextflow expects there to be a file system that is shared between where the workflow is run and the HPC environment where the processes are executed, and there is no general mechanism for staging data. So if there is no shared file system between these, the workflow itself has to manage the transfer of the input files to the HPC file system and the transfer of the results back, using some mechanism like scp.
The use of S3 buckets seems to be the single exception to this where a mechanism specific to S3 is provided.
Is my understanding correct?
Stephen Kelly
@stevekm
Jan 21 18:47
is it possible to use log.info from within nextflow.config? I am getting errors like
No signature of method: groovy.util.ConfigObject.info() is applicable for argument types: (org.codehaus.groovy.runtime.GStringImpl) values: [aa/bddfef]
Possible solutions: find(), any(), any(groovy.lang.Closure), find(groovy.lang.Closure), find(groovy.lang.Closure), min(groovy.lang.Closure)