These are chat archives for nextflow-io/nextflow

16th
Feb 2018
Phil Ewels
@ewels
Feb 16 2018 08:53
Hi @pditommaso - the other day we were talking about switching .collect() to .toList(). This was for when we have conditional processes (using when) that funnel into MultiQC (if they don't run, MultiQC just waits forever).
And now nextflow hangs before the MultiQC process always, even when the conditional processes do run.. :laughing:
Paolo Di Tommaso
@pditommaso
Feb 16 2018 08:54
ooops
Phil Ewels
@ewels
Feb 16 2018 08:54
which is weird, as I'm pretty sure that we used to use .toList() before .collect() existed..
Paolo Di Tommaso
@pditommaso
Feb 16 2018 08:55
remind me what was the problem with collect
if it doesn't run because there's only one bam file, then MultiQC doesn't run as it expects sample_correlation_results output
Paolo Di Tommaso
@pditommaso
Feb 16 2018 08:58
ah yes, if that doesn't run, collect returns an empty channel, which prevents multiqc from being executed
I think the best workaround is to use toList only for that process
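
A minimal sketch of the difference described above, assuming the documented operator behaviour: on an empty source channel, collect emits nothing (so a downstream process never starts), while toList emits a single empty list. The view closures and labels are only for illustration.

```groovy
// collect on an empty channel emits nothing, so nothing is printed
Channel.empty()
    .collect()
    .view { "collect: $it" }

// toList on an empty channel emits one empty list
Channel.empty()
    .toList()
    .view { "toList: $it" }
```

With a non-empty source, both operators emit one list containing all items, which is why swapping only the conditional input to toList is the suggested workaround.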
Paolo Di Tommaso
@pditommaso
Feb 16 2018 09:39
@Tintest your snippet should work
is the bcl2fastq process executed at all ?
Tintest
@Tintest
Feb 16 2018 09:45
@pditommaso Yes the process works, maybe it is because I'm redirecting my output to a logfile with " &> bcl2fastq.log"
Anyway I checked in the logfile what I wanted to see and everything works fine ! For now ... :) Thanks !
Paolo Di Tommaso
@pditommaso
Feb 16 2018 09:49
ahh, sure, I didn't see the redirection at the end of your snippet
without that it works :)
Tintest
@Tintest
Feb 16 2018 09:52
Yeah I should have thought a bit more before asking my question, sorry for bothering you :) (But unfortunately for you it will continue, if I continue to use nextflow :D)
Paolo Di Tommaso
@pditommaso
Feb 16 2018 09:53
you are welcome
Phil Ewels
@ewels
Feb 16 2018 10:00
ok cool - I'll try changing just that one process then and see what happens
hmm, still hanging before MultiQC
input:
    file multiqc_config
    file (fastqc:'fastqc/*') from fastqc_results.collect()
    file ('trimgalore/*') from trimgalore_results.collect()
    file ('alignment/*') from alignment_logs.collect()
    file ('rseqc/*') from rseqc_results.collect()
    file ('rseqc/*') from genebody_coverage_results.collect()
    file ('preseq/*') from preseq_results.collect()
    file ('dupradar/*') from dupradar_results.collect()
    file ('featureCounts/*') from featureCounts_logs.collect()
    file ('featureCounts_biotype/*') from featureCounts_biotype.collect()
    file ('stringtie/stringtie_log*') from stringtie_log.collect()
    file ('sample_correlation_results/*') from sample_correlation_results.toList()
    file ('software_versions/*') from software_versions_yaml.collect()
Paolo Di Tommaso
@pditommaso
Feb 16 2018 10:06
are you sure that's not another change causing the hang ?
Phil Ewels
@ewels
Feb 16 2018 10:06
pretty sure - I switch sample_correlation_results.toList() back to sample_correlation_results.collect() and MultiQC runs again
(running with three samples, so sample_correlation_results isn't empty)
Paolo Di Tommaso
@pditommaso
Feb 16 2018 10:11
can you try to isolate the problem and open an issue on GH so we can discuss there?
Phil Ewels
@ewels
Feb 16 2018 10:18
will try :+1:
Phil Ewels
@ewels
Feb 16 2018 10:39
ah damn it, nearly finished writing my nice GitHub issue with a minimal example and everything, and here toList() does seem to work as you say it should
Paolo Di Tommaso
@pditommaso
Feb 16 2018 10:48
bug self healing :satisfied:
Phil Ewels
@ewels
Feb 16 2018 12:25
something like that..!
weird, I'm going to end up rewriting the entire pipeline at this rate
Phil Ewels
@ewels
Feb 16 2018 12:49
ok, I went the other way - hacking away at the pipeline that already had the bug and managed to come up with a minimal example
nextflow-io/nextflow#611
Paolo Di Tommaso
@pditommaso
Feb 16 2018 12:51
I will have a look asap
:+1:
Phil Ewels
@ewels
Feb 16 2018 12:58
Thanks!
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:18

Hi Paolo, what can be a problem here? I define a parameter:

params.reads = 'cram/*.cram'

and then invoke it in the channel creation:

read_files_cram = Channel.fromPath( params.reads )

then I use this channel in a process as an input:

    input:
    set val(name), file(reads) from read_files_cram

NF throws the following warning:

Input tuple does not match input set cardinality declared by process

and does not seem to get the file names...

Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:19
hey
I think you wanted to use Channel.fromFilePairs instead of Channel.fromPath
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:20
hmm, I can’t find Channel.fromFilePairs in the docs
but will try now, thanks
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:21
but I don’t have pairs though
it’s just a bunch of cram files
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:21
oops
so forget that
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:21
ok
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:22
the problem is that you declared the input as a pair
    input:
    set val(name), file(reads) from read_files_cram
hence it's expecting a name and one or more files
use instead
    input:
    file(reads) from read_files_cram
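
A minimal sketch of the two ways to make the channel and the input declaration agree. The channel name named_cram and the use of baseName as the sample name are assumptions for illustration, not part of the original pipeline.

```groovy
params.reads = 'cram/*.cram'

// Option 1: a plain channel of paths, which matches
//     file(reads) from read_files_cram
read_files_cram = Channel.fromPath( params.reads )

// Option 2 (hypothetical): emit [name, file] pairs, which matches
//     set val(name), file(reads) from named_cram
named_cram = Channel
    .fromPath( params.reads )
    .map { f -> [ f.baseName, f ] }   // file name without extension as the name
```

Either way, the number of elements each channel item carries has to equal the number of elements declared in the process input.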
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:22
Aa, ok, I see, I was reusing Phil’s code… Thanks a lot!
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:23
too much copy and paste ;)
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:23
and have a nice weekend! You are a star
;-)
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:23
you too !
:)
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:35
Sorry, last question for today. Is it OK to use minikube locally to test the k8s executor?
Or doesn't make sense?
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:35
it should work
I'm planning to write some docs next week
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:37
Cool, thanks! Nice weekend again!
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:37
same there !
(btw if you are using mac, docker for mac includes a K8s cluster ..)
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:39
Oo, thanks, I didn't know that
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:40
much easier than minikube and works like a charm
Vladimir Kiselev
@wikiselev
Feb 16 2018 17:46
Nice!
Paolo Di Tommaso
@pditommaso
Feb 16 2018 17:47
indeed!
Shawn Rynearson
@srynobio
Feb 16 2018 20:09

I have a strange error so I thought I would ask you guys to see if it's something you've seen before:

I'm launching a job to aws-batch and I get the following:

$ nextflow run aws.pipeline.nf -c aws.2.config -w s3://mybucket
N E X T F L O W  ~  version 0.26.0
Launching `ucgd.pipeline.nf` [scruffy_church] - revision: e420bfdbfd
Exception in thread "Thread-1" java.lang.IllegalArgumentException: Key cannot be empty
        at com.amazonaws.util.ValidationUtils.assertStringNotEmpty(ValidationUtils.java:89)
        at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1347)
        at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1259)
        at com.upplication.s3fs.AmazonS3Client.getObject(AmazonS3Client.java:91)
        at com.upplication.s3fs.util.S3ObjectSummaryLookup.getS3Object(S3ObjectSummaryLookup.java:197)
        at com.upplication.s3fs.util.S3ObjectSummaryLookup.lookup(S3ObjectSummaryLookup.java:88)
        at com.upplication.s3fs.S3FileSystemProvider.readAttributes(S3FileSystemProvider.java:636)
        at java.nio.file.Files.readAttributes(Files.java:1737)
        at java.nio.file.FileTreeWalker.getAttributes(FileTreeWalker.java:219)
        at java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:276)
        at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:322)
        at java.nio.file.Files.walkFileTree(Files.java:2662)
        at nextflow.file.FileHelper.visitFiles(FileHelper.groovy:720)
        at nextflow.file.FileHelper$visitFiles$1.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:149)
        at nextflow.Channel$_pathImpl_closure3.doCall(Channel.groovy:269)
        at nextflow.Channel$_pathImpl_closure3.doCall(Channel.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at groovy.lang.Closure.call(Closure.java:414)
        at groovy.lang.Closure.call(Closure.java:408)
        at groovy.lang.Closure.run(Closure.java:495)
        at java.lang.Thread.run(Thread.java:745)
[warm up] executor > awsbatch
[84/85b7e2] Submitted process > fastqc

It writes command files to the bucket [log/sh]

Also when I run a process that just prints a version it works and the log file is readable in my bucket.