These are chat archives for nextflow-io/nextflow

13th
Mar 2018
Paolo Di Tommaso
@pditommaso
Mar 13 2018 08:13
@boulund you cannot have optional inputs, but you can simply have a file to which you give an empty semantic
params.filter_seqs = 'NO_FILTER'
filter_file = file(params.filter_seqs)

process foo {
  input:
  file filter_file 

  script:
  def filter_argument = filter_file.name != 'NO_FILTER' ? "--filter ${params.filter_seqs}" : ''

  """
  echo $filter_argument
  """
}
I think it's a reasonable workaround
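For context, a quick sketch of how the two cases in the snippet above might be invoked, assuming it lives in a script named main.nf (the file names here are hypothetical):

```
# default: filter_file resolves to the NO_FILTER placeholder, so no --filter flag is emitted
nextflow run main.nf

# override: the staged file name differs from NO_FILTER, so --filter is passed
nextflow run main.nf --filter_seqs filters.fasta
```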
Shellfishgene
@Shellfishgene
Mar 13 2018 10:29
Will commands that go in the afterScript directive be inside the bash script that is submitted to the cluster scheduler? Our cluster requires we add `export SCRATCH="/scratch/"$(echo $PBS_JOBID | cut -f2 -d\:)` to the script for deleting files in the temp dir on the node. I'm not sure if this will work correctly if nextflow puts this in the command.sh but not in the command.run file that is actually submitted.
Paolo Di Tommaso
@pditommaso
Mar 13 2018 10:30
ugly !
doesn't your cluster define the TMPDIR variable for that ?
Fredrik Boulund
@boulund
Mar 13 2018 10:36
Thanks @pditommaso (and thanks @ewels for letting me know there was a reply)
Paolo Di Tommaso
@pditommaso
Mar 13 2018 10:37
don't you receive notifications ? :)
Fredrik Boulund
@boulund
Mar 13 2018 10:37
:D If I keep gitter open then I probably would
Shellfishgene
@Shellfishgene
Mar 13 2018 10:38
It does have $TMPDIR, but the manual says it's not cleaned up automatically after the job.
Fredrik Boulund
@boulund
Mar 13 2018 10:38
I like your solution, it is clear and hopefully fairly easy to read and understand in a few weeks from now :)
Paolo Di Tommaso
@pditommaso
Mar 13 2018 10:40
It does have $TMPDIR, but the manual says it's not cleaned up automatically after the job.
but NF does it for you if you set process.scratch = true ;)
if you want to use the scratch path you should be able to add in the nextflow config
env.SCRATCH='/scratch/$(echo $PBS_JOBID | cut -f2 -d\:)'
env.TMPDIR='$SCRATCH'
process.scratch = true
note the use of ' instead of "
Shellfishgene
@Shellfishgene
Mar 13 2018 10:46
Ah, cool. If nf cleans it up anyway, env.TMPDIR='$TMPDIR' should be enough I guess.
Paolo Di Tommaso
@pditommaso
Mar 13 2018 10:47
umm no this env.TMPDIR='$TMPDIR' is useless
Shellfishgene
@Shellfishgene
Mar 13 2018 10:49
Duh, indeed. So nf uses $TMPDIR anyway, so just process.scratch = true is all I need?
Paolo Di Tommaso
@pditommaso
Mar 13 2018 10:50
you need process.scratch = true to instruct NF to clean up local temporary files
that by default are placed in the directory given by TMPDIR, if specified, or /tmp otherwise
if you want to store those files in the scratch path as your sysadmins are suggesting, use the snippet above
otherwise just process.scratch = true
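For reference, the simple case above as a minimal nextflow.config fragment (a sketch of what the conversation describes, not a verbatim quote from the docs):

```groovy
// nextflow.config
// Run each task in a temporary directory created under $TMPDIR
// (or /tmp when TMPDIR is unset); NF removes it when the task ends.
process.scratch = true
```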
Shellfishgene
@Shellfishgene
Mar 13 2018 10:54
Ok, thanks. Actually I don't get why there is $TMPDIR and $SCRATCH on our cluster, but that's not a nf problem...
Shellfishgene
@Shellfishgene
Mar 13 2018 12:50
In my config I have memory = { 32.GB * task.attempt } and in the process -Xmx${task.memory.toGiga()}g, but I get the error Cannot invoke method toGiga() on null object. Is there an error in the config somewhere or is it something else?
Paolo Di Tommaso
@pditommaso
Mar 13 2018 13:04
umm, it should be process.memory = { 32.GB * task.attempt }
Shellfishgene
@Shellfishgene
Mar 13 2018 13:12
it's inside a $markDuplicates { ... } block, that should work, no?
Shellfishgene
@Shellfishgene
Mar 13 2018 13:22
Forgot to specify config file on the command line...
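Putting the thread together, a sketch of the pattern being discussed, assuming a process named markDuplicates and a config file named custom.config (the retry settings are a plausible companion to task.attempt, not quoted from the chat):

```groovy
// custom.config — per-process memory that grows on each retry attempt
process {
    $markDuplicates {
        memory = { 32.GB * task.attempt }
        errorStrategy = 'retry'
        maxRetries = 3
    }
}
```

Inside the process script the value is read back with `-Xmx${task.memory.toGiga()}g`, and the config only takes effect when passed explicitly, e.g. `nextflow run main.nf -c custom.config`.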
Paolo Di Tommaso
@pditommaso
Mar 13 2018 13:22
ah
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:25

Hello, when I send my file names to the channel this way:

    output: 
        set val(sample), file('*.cram') into cram_files

and then input them to the other process this way:

    input:
        set val(sample), file(cram) from cram_files

and then call ${cram} in the script, it somehow inserts all of my file names instead of creating a separate job for each file. So it looks like this (I have 3 input cram files):

#!/bin/bash -euo pipefail
samtools fastq \
    -N \
    -@ 4 \
    -1 24919_1#2.cram 24919_1#3.cram 24919_1#4.cram_R1_001.fastq.gz

Am I doing something wrong?

Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:32
that's expected
I guess that a transponse should do what you are expecting
    input:
        set val(sample), file(cram) from cram_files.transponse()
if not ping again
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:35
Aa, ok, thanks. But in this case does the length of sample have to be the same as the number of cram files?
Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:35
sample is a single value, no?
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:35
yes
Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:36
should work if so
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:36
I am trying to pass a sample ID and all associated files to the channel
ok, will try now
Basically an equivalent of a job array, as I see it...
Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:37
more or less ..
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:40
ERROR ~ No signature of method: groovyx.gpars.dataflow.DataflowQueue.transponse() is applicable for argument types: () values: []
Possible solutions: transpose(), transpose(java.util.Map)
Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:40
sorry my fault, there's a typo, it's transpose
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:40
ohhh
I see
Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:40
:)
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:40
should have checked ;-)
hahaha, I copied it from the docs and there is another typo in this word there ;-)
The traspose operator transforms a channel in such...
didn’t check again ;-)
Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:44
:facepalm: !!
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:44
:smile:
it worked, many thanks @pditommaso ! You can’t simply transpose!
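For reference, a minimal sketch of what transpose does to a channel of [value, list] tuples (sample and file names hypothetical):

```groovy
// transpose unrolls the inner list, repeating the sample value for each file
Channel
    .from( ['s1', ['a.cram', 'b.cram', 'c.cram']] )
    .transpose()
    .subscribe { println it }
// emits, one tuple at a time: [s1, a.cram], [s1, b.cram], [s1, c.cram]
```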
Paolo Di Tommaso
@pditommaso
Mar 13 2018 16:48
that improves my karma :)
Vladimir Kiselev
@wikiselev
Mar 13 2018 16:48
:smile:
Vladimir Kiselev
@wikiselev
Mar 13 2018 22:20
@pditommaso is it possible to add an iterator to a process, so that it will have a unique value inside each job submitted by that process? Maybe there is an operator for that, I didn’t find it though...
Paolo Di Tommaso
@pditommaso
Mar 13 2018 22:53
yep, repeaters !
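"Repeaters" here presumably refers to the each input qualifier; a minimal sketch (process name and values hypothetical):

```groovy
// runs foo once per value in the list, combined with any other inputs
process foo {
    input:
    each mode from ['fast', 'sensitive', 'exhaustive']

    script:
    """
    echo running in $mode mode
    """
}
```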
Mike Smoot
@mes5k
Mar 13 2018 23:03
@wikiselev I interpreted your question differently. If you want something like a with_index operator that adds an index to each element of a channel, then the last time I checked, nothing like that was available. However, it's easy enough to code your own:
def ind = 0

your_channel.map{ [ind++, it] }
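A self-contained sketch of the counter idea above (channel contents hypothetical; Groovy closures can mutate captured locals, which is what makes this work):

```groovy
def ind = 0
Channel
    .from('a', 'b', 'c')
    .map { [ind++, it] }
    .subscribe { println it }
// emits [0, a], [1, b], [2, c] — note this relies on the channel being
// consumed by a single map operator, in emission order
```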
Paolo Di Tommaso
@pditommaso
Mar 13 2018 23:04
do you mean I should read the questions sometimes? :satisfied:
Mike Smoot
@mes5k
Mar 13 2018 23:12
It's late where you are! You're forgiven. :)