These are chat archives for nextflow-io/nextflow

26th
Mar 2019
Paolo Di Tommaso
@pditommaso
Mar 26 07:01
have a look at slashy strings
micans
@micans
Mar 26 10:13
trigger-warning shell hell:
    shell:
    '''
    perl -ne 'if (/^>(\\w+)(?:\\.\\d+)\\s+.*?gene:(\\w+)/){print "$1\\t$2\\n"}elsif(/^>(ERCC\\S+)/){print"$1\\t$1-gene\\n"}' \\
      !{fasta} > trans_gene.txt
(etc)
    '''
Rad Suchecki
@rsuchecki
Mar 26 10:22
:scream_cat:
micans
@micans
Mar 26 10:24
Yeah, I could have written this in a nicer way and stuck it in a script, but oh well, life is brutal anyway.
Rad Suchecki
@rsuchecki
Mar 26 10:25
was thinking that too, but would be nice to have this in the main script
micans
@micans
Mar 26 10:26
I started coding in 94, perl was big then :-) I still like it a lot as a superset of sed and awk. It's part of unix now.
but I do feel like I need to apologise :-P
Rad Suchecki
@rsuchecki
Mar 26 10:27
yes, I usually get into this problem when stuffing too much awk into the pipeline
   shell: 
   $/
    perl -ne 'if (/^>(\\w+)(?:\\.\\d+)\\s+.*?gene:(\\w+)/){print "$1\\t$2\\n"}elsif(/^>(ERCC\\S+)/){print"$1\\t$1-gene\\n"}' \\
      !{fasta} > trans_gene.txt
  /$
ahhh perhaps not
micans
@micans
Mar 26 10:29
Appending the ERCC transcripts to the cdna file made this bit of code awkwardly long. Although we have a colleague who wrote a legendary bam demultiplexer as a perl one-liner, which took up about 50 lines once formatted.
that's interesting!
Rad Suchecki
@rsuchecki
Mar 26 10:29
should have escaped the $s now - this was just a musing on http://groovy-lang.org/syntax.html#_string_summary_table
micans
@micans
Mar 26 10:30
wow, dollar slashy. And no need to escape backslash it seems.
some tests needed later ...
Rad Suchecki
@rsuchecki
Mar 26 10:32
:thumbsup:
Tried a few things, liking the dollar slashy!
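A minimal plain-Groovy sketch of the point made above about backslashes. It only illustrates how the three string styles treat `\` and `/`, following the string summary table linked earlier. It does not show that a dollar slashy string is safe to use in a Nextflow `shell:` block, where the `$` characters still need care. The variable names are made up for illustration.

```groovy
// Three ways to write the same four characters: a, backslash, t, b.

// Single-quoted: backslash is the escape character, so it must be doubled.
// Triple-single-quoted ''' strings behave the same way, which is why the
// perl one-liner above doubles every backslash.
def singleQuoted = 'a\\tb'

// Slashy: backslash is literal (it only escapes a forward slash).
def slashy = /a\tb/

// Dollar slashy: backslash is literal; the escape character is $.
def dollarSlashy = $/a\tb/$

assert singleQuoted == slashy
assert slashy == dollarSlashy
assert slashy.length() == 4

// A forward slash must be escaped in a slashy string,
// but not in a dollar slashy one.
def slashyPath = /usr\/bin/
def dollarSlashyPath = $/usr/bin/$

assert slashyPath == 'usr/bin'
assert dollarSlashyPath == 'usr/bin'
```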
micans
@micans
Mar 26 11:07
The memory directive, when specified in the process itself, is documented as memory '2 GB'. Is this correct? I was expecting memory '2.GB'.
Paolo Di Tommaso
@pditommaso
Mar 26 11:08
'2 GB' - OR - 2.GB
micans
@micans
Mar 26 11:08
Cool, thanks!
so memory 2.GB without quotes, correct?
Paolo Di Tommaso
@pditommaso
Mar 26 11:08
yes
micans
@micans
Mar 26 11:08
:+1:
Paolo Di Tommaso
@pditommaso
Mar 26 11:08
the trick is the dot
that transforms the number into a mem value
micans
@micans
Mar 26 11:09
I prefer 2.GB .... '2 GB' looks so stringy
the first very much looks like a mem value indeed
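For reference, a sketch of the two spellings confirmed above, side by side. The process names and the `echo` commands are hypothetical, and the processes are written in the DSL1 style used elsewhere in this chat; under DSL2 they would also need to be called from a `workflow` block.

```groovy
// Both processes request the same amount of memory.

process memory_as_string {
    // quoted string form, as shown in the documentation
    memory '2 GB'

    script:
    """
    echo "memory requested via a quoted string"
    """
}

process memory_as_unit {
    // unquoted form: the dot turns the number into a memory value
    memory 2.GB

    script:
    """
    echo "memory requested via a memory unit"
    """
}
```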
Calling for presentations and tutorials for Nextflow Camp in September :tada: :tada:
Jonathan Manning
@pinin4fjords
Mar 26 13:54
If I run the same workflow multiple times, from different working directories on the same cluster, do they interfere? I mean, does Nextflow keep track of job IDs, or does it parse the cluster status?
micans
@micans
Mar 26 15:26
As far as I understand the question, you can definitely do this and nothing interferes. Everything lives in its own namespace. But this answer is primarily motivated by the feeling (with some experience) that this is how it must work rather than knowledge of internals or documentation.
Jonathan Manning
@pinin4fjords
Mar 26 15:32
I was trying to explain an issue I was having when I scaled up to deploy a pipeline in production, with multiple instances of the same pipeline running concurrently, but I /think/ I found the culprit elsewhere - thanks.
Chelsea Sawyer
@csawye01
Mar 26 16:00
Has anyone seen this Fastq_screen error in their pipeline: Aligner bowtie2 not executable at 'bowtie2', please adjust configuration.? The fastq_screen module from the config is loading with bowtie2.
Sven F.
@sven1103
Mar 26 17:52
@stevekm @KochTobi have a look at nf-core/tools#288 and https://github.com/qbicsoftware/nextflow-logger-service. Still in its early stage, but I give the latter priority for our facility ;)
Vladimir Kiselev
@wikiselev
Mar 26 20:13
How do I make the full content of a directory (including all its files, all subdirectories and the files in them) available to a process?
is it simply .collect()?
I can’t figure out what regex to use to make subdirectories visible
Jonathan Manning
@pinin4fjords
Mar 26 20:18

@wikiselev Channel.fromPath() with type: 'dir'? That will work if the directory exists before running the workflow.

.collect() will work to gather all channel elements into a single input for a process, if that's what you mean

Vladimir Kiselev
@wikiselev
Mar 26 20:23
Nice, many thanks!
Jonathan Manning
@pinin4fjords
Mar 26 20:25
No worries :-)
Vladimir Kiselev
@wikiselev
Mar 26 20:25
The only problem: NF creates an extra directory and puts the directory of interest inside it...
Jonathan Manning
@pinin4fjords
Mar 26 20:27
Yep, a symlink, right? Is that a problem?
Vladimir Kiselev
@wikiselev
Mar 26 20:28
Oh, I see
now it’s clear, thanks again!
Jonathan Manning
@pinin4fjords
Mar 26 20:28
np
Vladimir Kiselev
@wikiselev
Mar 26 20:28
shows how experienced I am with NF!
no, wait!
I am doing this:
ch_course_files = Channel.fromPath('course_files', type: 'dir')
and what I see then in the work directory is this:
▶ ls -la work/38/9cc58a5aad096dc68c570f14ba985c/               
total 56
drwxr-xr-x  12 vk6  1307   384 26 Mar 20:27 .
drwxr-xr-x   4 vk6  1307   128 26 Mar 20:27 ..
-rw-r--r--   1 vk6  1307     0 26 Mar 20:27 .command.begin
-rw-r--r--   1 vk6  1307   266 26 Mar 20:27 .command.err
-rw-r--r--   1 vk6  1307   266 26 Mar 20:27 .command.log
-rw-r--r--   1 vk6  1307     0 26 Mar 20:27 .command.out
-rw-r--r--   1 vk6  1307  1905 26 Mar 20:27 .command.run
-rw-r--r--   1 vk6  1307    96 26 Mar 20:27 .command.sh
-rw-r--r--   1 vk6  1307  2433 26 Mar 20:27 .command.stub
-rw-r--r--   1 vk6  1307   137 26 Mar 20:27 .command.trace
-rw-r--r--   1 vk6  1307     1 26 Mar 20:27 .exitcode
drwxr-xr-x   3 vk6  1307    96 26 Mar 20:27 _bookdown_files
and the symlink is inside of _bookdown_files:
▶ ls -la work/38/9cc58a5aad096dc68c570f14ba985c/_bookdown_files
total 0
drwxr-xr-x   3 vk6  1307   96 26 Mar 20:27 .
drwxr-xr-x  12 vk6  1307  384 26 Mar 20:27 ..
lrwxr-xr-x   1 vk6  1307   49 26 Mar 20:27 course_files -> /Users/vk6/cellgeni/scRNA.seq.course/course_files
I have no idea where _bookdown_files comes from!
Jonathan Manning
@pinin4fjords
Mar 26 20:32
Hmm. What does the process definition look like?
Vladimir Kiselev
@wikiselev
Mar 26 20:33
process foo {
  input: 
    file fs from ch_course_files
  script:
  """
  Rscript -e "bookdown::render_book('index.html', 'bookdown::gitbook')"
  """
}
another problem is that I need the files in the directory to be available in the root of the work directory! The tool is not working otherwise ;-)
Complicated stuff
Jonathan Manning
@pinin4fjords
Mar 26 20:37
I have no idea why your symlink ends up in the bookdown dir. But you can do whatever additional symlinking you need before your Rscript call
Vladimir Kiselev
@wikiselev
Mar 26 20:41
ok, I see, thanks a lot!
Jonathan Manning
@pinin4fjords
Mar 26 20:42
no worries
Vladimir Kiselev
@wikiselev
Mar 26 20:54
Adding ln -s course_files/* . before the Rscript solved the problem
_bookdown_files is created by bookdown on failure and it moves everything in there...
Jonathan Manning
@pinin4fjords
Mar 26 21:30
awesome
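Putting the pieces of this thread together, the working setup might look roughly like the sketch below. It reuses the channel and the Rscript call exactly as posted and adds the `ln -s` line that solved the problem. The process name `render_book` and the input name `course_dir` are hypothetical, and the syntax is DSL1 as in the rest of the chat.

```groovy
// Emit the directory itself (not its files) as a single channel item.
ch_course_files = Channel.fromPath('course_files', type: 'dir')

process render_book {
  input:
    // Nextflow stages the directory into the task work dir as a symlink
    file course_dir from ch_course_files

  script:
  """
  # bookdown expects its files in the root of the work directory,
  # so link the contents of the staged directory into it
  ln -s ${course_dir}/* .
  Rscript -e "bookdown::render_book('index.html', 'bookdown::gitbook')"
  """
}
```

Because the whole directory arrives as one symlink, its subdirectories are visible without any glob pattern or `.collect()`.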