These are chat archives for nextflow-io/nextflow

26th
Apr 2019
Rad Suchecki
@rsuchecki
Apr 26 00:23

@phiweger you could output sample IDs along with both types of data; then, in the mapping process, you could do something like this:

when:
  assemblyID == readsID

This goes hand-in-hand with using assembliesChannel.combine(nanoporeChannel) so that all combinations are evaluated
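
A minimal sketch of that pattern, written in the DSL1 syntax used throughout this chat. The channel contents, file locations, process name, and the `echo` stand-in for the real mapping command are all hypothetical; only the `combine` call and the `when:` guard come from the message above.

```groovy
// Hypothetical inputs: each item is a tuple of (sample ID, file).
assembliesChannel = Channel.from(
    ['s1', file('assemblies/s1.fasta')],
    ['s2', file('assemblies/s2.fasta')]
)
nanoporeChannel = Channel.from(
    ['s1', file('reads/s1.fastq.gz')],
    ['s2', file('reads/s2.fastq.gz')]
)

process mapping {
    input:
    // combine emits every assembly/reads pairing as one flat tuple
    set val(assemblyID), file(assembly), val(readsID), file(reads) from assembliesChannel.combine(nanoporeChannel)

    // only pairings whose IDs agree are actually executed
    when:
    assemblyID == readsID

    script:
    """
    echo "mapping ${reads} against ${assembly}"
    """
}
```

With two samples on each side, four combinations are evaluated and the `when:` guard should let two of them run.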

Ethan Bensman
@ebenz99
Apr 26 01:44
[image: image.png]
Does anyone know why the above code would result in 5 newline characters being printed into a.txt, and not the numbers from num?
Rad Suchecki
@rsuchecki
Apr 26 01:47
Using single quotes means that $x is treated as a Bash variable, not a Nextflow (NF) one.
Hint: rather than pasting images, wrap your code in triple back-ticks.
so you can try
"""
echo $x > a.txt
"""
I'm using > rather than >> because these will be 5 distinct a.txt files, so there is no point in appending.
Rad Suchecki
@rsuchecki
Apr 26 01:53
If you want to stick to single quotes around your script block, declare it as a shell: block and use !{x} in place of $x: https://www.nextflow.io/docs/latest/process.html#shell
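
The two quoting styles can be put side by side in a short DSL1 sketch. The original code was only posted as an image, so the process names and channel names here are hypothetical; the five values stand in for the `num` channel mentioned in the question.

```groovy
// Hypothetical setup: five numbers, duplicated because a DSL1 channel
// can only be consumed by one process.
Channel.from(1, 2, 3, 4, 5).into { num_a; num_b }

process withScript {
    input:
    val x from num_a

    output:
    file 'a.txt' into out_a

    // Double quotes: Nextflow interpolates $x before Bash sees the script.
    script:
    """
    echo $x > a.txt
    """
}

process withShell {
    input:
    val x from num_b

    output:
    file 'a.txt' into out_b

    // shell: block with single quotes: !{x} is filled in by Nextflow,
    // while any $variable is left for Bash to expand.
    shell:
    '''
    echo !{x} > a.txt
    '''
}
```

With single quotes in a plain `script:` block, `$x` would reach Bash unset and expand to nothing, which would explain the a.txt files containing only a newline.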
Anna Syme
@AnnaSyme
Apr 26 03:03
A question about reading input files. We can't just specify myfile = "data/somefile.txt" at the start of the NF script, can we? It needs to be params.myfile = "data/somefile.txt" and then my_input = file(params.myfile), which is then accessed in the code block as $my_input?
Anna Syme
@AnnaSyme
Apr 26 03:49
(Is this making a params object?)
Rad Suchecki
@rsuchecki
Apr 26 04:09
I don't think this should matter: myfile = "data/somefile.txt" creates a string, which is equivalent to storing the same string in params (as the value assigned to the key myfile).
As for creating a file object, it depends on where you want to use it.
Anna Syme
@AnnaSyme
Apr 26 04:59
Thanks, I see, I would need to make the file object and then read it. (But makes sense to use params, I just wasn't clear on what it was doing).
Rad Suchecki
@rsuchecki
Apr 26 05:13
I guess you need to create a file object if you are accessing the content outside a process.
As for accessing it, probably not $my_input but simply my_input, otherwise string interpolation will get in the way.
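
A small DSL1 sketch of the points in this exchange, assuming the file exists at the path from the question. The process name and the `wc -l` command are hypothetical.

```groovy
// A params entry can be overridden on the command line (--myfile other.txt);
// a plain variable cannot.
params.myfile = 'data/somefile.txt'

// file() turns the string into a file object
my_input = file(params.myfile)

// Outside a process, the file object is what lets you read the content
println my_input.text

process countLines {
    input:
    // refer to the variable directly, without $ or quotes
    file f from my_input

    script:
    """
    wc -l < ${f}
    """
}
```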
Pierre Lindenbaum
@lindenb
Apr 26 07:29
Hi all, is there any way to capture short text values as val, or is saving to text files the only way to retrieve the information? For example, if I want to get the sample name and the read length from a BAM:
process test {
input:
    file bam from input1
output:    
    set bam,sample_name,read_len into output2
script:

"""

## get sample-name

samtools view -H ${bam} | grep '@RG' | tr "\\t" "\n" | grep 'SM' | cut -d ':' -f 2   > ???

## get read-length

samtools view jeter.bam | head -n 1 | awk '{print length(\$10)}' > ???

"""
}
(Hmm, I cannot edit. That should be samtools view ${bam}, of course.)
Paolo Di Tommaso
@pditommaso
Apr 26 08:25
Other than saving the output to a file, you can capture the standard output using:
output:    
    set file(bam),stdout(read_len) into output2
but that won't be able to handle multiple values, so for those just save to files
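
For the single-value case, a hedged DSL1 sketch of the stdout approach might look like the following. The input glob and process name are hypothetical, and the output declaration uses the bare `stdout` form rather than the exact spelling in the message above. It assumes `samtools` is on the PATH and that each BAM has at least one read.

```groovy
input1 = Channel.fromPath('data/*.bam')   // hypothetical location

process readLength {
    input:
    file bam from input1

    output:
    // pair the BAM with whatever the script prints to standard output
    set file(bam), stdout into output2

    script:
    """
    samtools view ${bam} | head -n 1 | awk '{print length(\$10)}'
    """
}

// The captured text keeps its trailing newline, so trim it before use
output2
    .map { bam, len -> [bam, len.trim()] }
    .view()
```

Because only one stream is captured, getting both the sample name and the read length this way would mean printing them on one line and splitting afterwards, which is where saving to files becomes the simpler option.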
Pierre Lindenbaum
@lindenb
Apr 26 08:28
@pditommaso ok thanks !
Paolo Di Tommaso
@pditommaso
Apr 26 08:28
:+1:
Francesco Strozzi
@fstrozzi
Apr 26 10:43
Hi guys, has anybody experienced a segmentation fault when instantiating a NF job? In our tests, this happens when a NF job does a collect on a channel with a lot of files in it (the typical gather step in a pipeline). In this case we have more than 30k files to be staged in the job, and we narrowed the problem down to the bash limits, since the list of files is written out explicitly in the nxf_stage() function inside the .command.run file. An obvious workaround would be to set ulimit -s unlimited at the top of the .command.run file…
(side note, we are on AWS Batch, but it’s unrelated)
Adrian Viehweger
@phiweger
Apr 26 11:16
thanks @rsuchecki
Tobias "Tobi" Schraink
@tobsecret
Apr 26 19:18
since the last update nextflow asks for root permission, is that expected?
I now have to sudo nextflow
and the console is not working anymore either
(base) tobias@tobias-iMac:~$ sudo nextflow console
(base) tobias@tobias-iMac:~$ No protocol specified
Exception in thread "main" java.awt.AWTError: Can't connect to X11 window server using ':0' as the value of the DISPLAY variable.
    at sun.awt.X11GraphicsEnvironment.initDisplay(Native Method)
    at sun.awt.X11GraphicsEnvironment.access$200(X11GraphicsEnvironment.java:65)
    at sun.awt.X11GraphicsEnvironment$1.run(X11GraphicsEnvironment.java:115)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.awt.X11GraphicsEnvironment.<clinit>(X11GraphicsEnvironment.java:74)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at java.awt.GraphicsEnvironment.createGE(GraphicsEnvironment.java:103)
    at java.awt.GraphicsEnvironment.getLocalGraphicsEnvironment(GraphicsEnvironment.java:82)
    at sun.awt.X11.XToolkit.<clinit>(XToolkit.java:126)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at java.awt.Toolkit$2.run(Toolkit.java:860)
    at java.awt.Toolkit$2.run(Toolkit.java:855)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.awt.Toolkit.getDefaultToolkit(Toolkit.java:854)
    at javax.swing.UIManager.getSystemLookAndFeelClassName(UIManager.java:611)
    at javax.swing.UIManager$getSystemLookAndFeelClassName.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:119)
    at nextflow.ui.console.Nextflow.main(Nextflow.groovy:243)
Tobias "Tobi" Schraink
@tobsecret
Apr 26 20:14

Also having issues with .fromFilePairs for some reason:

$ ls test/1/*_{1,2}.fastq.gz
test/1/349_1.fastq.gz test/1/349_2.fastq.gz
test/1/525_1.fastq.gz test/1/525_2.fastq.gz

My nextflow script is:

Channel
    .fromFilePairs("test/1/*_{1,2}.fastq.gz")
    .println()
Channel
    .from(['foo', 'bar'])
    .println()

And will print:

<normal nextflow warmup messages>
foo
bar
Austin Keller
@austinkeller
Apr 26 21:18
I'm sure this is a really basic question. I'm just getting started with nextflow. But if I have a .txt file containing all of my filenames, how can I use these as paths? The docs make it clear how to use a glob with Channel.fromPath(). How can I get the same effect using a .txt file with explicitly defined paths? Thanks!
I'm as far as in_paths = Channel.fromPath( params.paths.txt).splitText()
Steve Marshall
@stevemmarshall
Apr 26 21:25
I'm running some AWS Batch jobs and I want to override my memory and vcpu allocations that have already been defined in my job def. Do I simply do this?
process {
    container = "job-definition://Batch-Genomics-Dev"
    queue = "BatchJobQueue"
    memory = "32G"
    vcpus = "16"
}
Austin Keller
@austinkeller
Apr 26 21:34
@stevemmarshall I haven't tried setting different cpu/memory for batch, but this may help: https://github.com/FredHutch/reproducible-workflows/blob/master/nextflow/interleave-fastq-pairs/interleave-fastq-pairs.nf. It looks like you may want to define it as "cpus" instead of "vcpus", but otherwise your syntax seems to match.
Steve Marshall
@stevemmarshall
Apr 26 21:36
thanks @austinkeller
I think I was able to alter those parameters from what I did above... then for a specific process I'll set a small number of cpus as you mentioned
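
Putting the two messages together, a configuration along these lines is one way to express it. This is a sketch only: the `smallStep` process name and its values are hypothetical, and it uses `cpus` as suggested above.

```groovy
// nextflow.config
process {
    // defaults applied to every process
    container = 'job-definition://Batch-Genomics-Dev'
    queue     = 'BatchJobQueue'
    memory    = '32 GB'
    cpus      = 16

    // override for one specific process (hypothetical name)
    withName: smallStep {
        cpus   = 2
        memory = '4 GB'
    }
}
```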
micans
@micans
Apr 26 21:40

@austinkeller this works for me:

Channel.fromPath("file.txt")
   .splitText()
   .map{ file(it.trim()) }
   .set{ch_file}

Make sure that the file names in 'file.txt' are fully qualified paths.
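
A slightly more defensive variant of the same idea, as a sketch: it takes the list location from a parameter (the name `params.paths` is an assumption) and skips blank lines, which would otherwise be passed to `file()` as empty strings.

```groovy
params.paths = 'file.txt'   // hypothetical parameter name

Channel
    .fromPath(params.paths)
    .splitText()                   // one item per line of the list
    .map { it.trim() }             // strip the trailing newline
    .filter { !it.isEmpty() }      // drop blank lines
    .map { file(it) }              // turn each path string into a file object
    .set { ch_file }
```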

Austin Keller
@austinkeller
Apr 26 21:42
perfect, thanks @micans ! Yeah I want to be able to handle either s3:// or local filepaths and so needed a proper solution using the Path() interface. And good to know that it needs a full path :thumbsup:
micans
@micans
Apr 26 21:44
yw!
micans
@micans
Apr 26 21:52
@tobsecret I can replicate what you see, but the channel still works and I can hook up a process to it; it seems it's just the println() (and view()) output that doesn't show. Oh, when I add -ansi-log false to the command line, it actually shows up.
Tobias "Tobi" Schraink
@tobsecret
Apr 26 22:57
Yeah, I eventually figured out that the channel works, as well.
Thanks for replicating! The workflow is working otherwise; we just had a problem we wanted to troubleshoot, hence the need for println. It ended up being nextflow-io/nextflow#768, so we took the solution Paolo suggests there.