These are chat archives for nextflow-io/nextflow

22nd
Jun 2018
Lavanya Veeravalli
@veeravalli
Jun 22 2018 02:15

Hi, I am having some weird problems with PBS cluster jobs. Jobs hang indefinitely, and the .nextflow.log entries are as below.

Jun-22 09:46:44.384 [Actor Thread 1] DEBUG nextflow.Session - <<< barrier arrive (process: gatk_hc)
Jun-22 09:46:44.386 [Actor Thread 105] INFO  nextflow.processor.TaskProcessor - [5f/041d55] Cached process > gatk_gt (region bfd859ad4 for sample WHH4965)
Jun-22 09:46:44.387 [Actor Thread 1] DEBUG nextflow.Session - <<< barrier arrive (process: gatk_gt)
Jun-22 09:46:44.416 [Actor Thread 2] INFO  nextflow.processor.TaskProcessor - [d7/98a190] Cached process > gtvcf_merge (WHB687)
Jun-22 09:46:44.416 [Actor Thread 59] INFO  nextflow.processor.TaskProcessor - [4c/34b4ea] Cached process > gtvcf_merge (WHH4967)
Jun-22 09:46:44.416 [Actor Thread 14] INFO  nextflow.processor.TaskProcessor - [54/2787eb] Cached process > gtvcf_merge (WHH4964)
Jun-22 09:46:44.416 [Actor Thread 43] INFO  nextflow.processor.TaskProcessor - [35/e987c2] Cached process > gtvcf_merge (WHB738)
Jun-22 09:46:44.416 [Actor Thread 98] INFO  nextflow.processor.TaskProcessor - [27/d82cc0] Cached process > gtvcf_merge (WHH4962)
Jun-22 09:46:44.416 [Actor Thread 10] INFO  nextflow.processor.TaskProcessor - [5a/01d9d8] Cached process > gtvcf_merge (WHB508)
Jun-22 09:46:44.416 [Actor Thread 102] INFO  nextflow.processor.TaskProcessor - [15/c4f961] Cached process > gtvcf_merge (WHB719)
Jun-22 09:46:44.422 [Actor Thread 12] INFO  nextflow.processor.TaskProcessor - [63/58126b] Cached process > gtvcf_merge (WHH4963)
Jun-22 09:46:44.422 [Actor Thread 25] INFO  nextflow.processor.TaskProcessor - [b2/a4d954] Cached process > gtvcf_merge (WHH4965)
Jun-22 09:46:44.426 [Actor Thread 6] INFO  nextflow.processor.TaskProcessor - [3a/b65603] Cached process > HardFilterAndMakeSitesOnlyVcf (WHB738)
Jun-22 09:46:44.426 [Actor Thread 75] INFO  nextflow.processor.TaskProcessor - [68/3ee543] Cached process > HardFilterAndMakeSitesOnlyVcf (WHH4967)
Jun-22 09:46:44.426 [Actor Thread 89] INFO  nextflow.processor.TaskProcessor - [62/50883b] Cached process > HardFilterAndMakeSitesOnlyVcf (WHB719)
Jun-22 09:46:44.426 [Actor Thread 48] INFO  nextflow.processor.TaskProcessor - [c0/8c2226] Cached process > HardFilterAndMakeSitesOnlyVcf (WHB687)
Jun-22 09:46:44.426 [Actor Thread 63] INFO  nextflow.processor.TaskProcessor - [67/bfe85a] Cached process > HardFilterAndMakeSitesOnlyVcf (WHH4965)
Jun-22 09:46:44.426 [Actor Thread 8] INFO  nextflow.processor.TaskProcessor - [6a/7eafa5] Cached process > HardFilterAndMakeSitesOnlyVcf (WHH4962)
Jun-22 09:46:44.426 [Actor Thread 56] INFO  nextflow.processor.TaskProcessor - [9d/da1ab3] Cached process > HardFilterAndMakeSitesOnlyVcf (WHH4964)
Jun-22 09:46:44.429 [Actor Thread 24] INFO  nextflow.processor.TaskProcessor - [64/db1bb9] Cached process > HardFilterAndMakeSitesOnlyVcf (WHB508)
Jun-22 09:46:44.429 [Actor Thread 101] INFO  nextflow.processor.TaskProcessor - [07/d513bf] Cached process > HardFilterAndMakeSitesOnlyVcf (WHH4963)
Jun-22 09:46:44.454 [Actor Thread 58] DEBUG nextflow.Session - <<< barrier arrive (process: gvcf_merge)
Jun-22 09:46:44.457 [Actor Thread 44] INFO  nextflow.processor.TaskProcessor - [91/502e85] Cached process > gtvcf_merge (WHH4961)
Jun-22 09:46:44.458 [Actor Thread 58] DEBUG nextflow.Session - <<< barrier arrive (process: gtvcf_merge)
Jun-22 09:46:44.461 [Actor Thread 22] INFO  nextflow.processor.TaskProcessor - [e2/9a594d] Cached process > HardFilterAndMakeSitesOnlyVcf (WHH4961)
Jun-22 09:46:44.463 [Actor Thread 58] DEBUG nextflow.Session - <<< barrier arrive (process: HardFilterAndMakeSitesOnlyVcf)
Jun-22 09:51:42.863 [Task monitor] DEBUG n.processor.TaskPollingMonitor - No more task to compute -- The following nodes are still active:
[process] VariantRecalibrator_SNPs
  status=ACTIVE
  port 0: (queue) OPEN; channel: -
  port 1: (value) -   ; channel: ref
  port 2: (value) -   ; channel: ref_fai
  port 3: (value) -   ; channel: ref_dict
  port 4: (value) -   ; channel: hapmap
  port 5: (value) -   ; channel: hapmap_idx
  port 6: (value) -   ; channel: omni
  port 7: (value) -   ; channel: omni_idx
  port 8: (value) -   ; channel: phase1_snps
  port 9: (value) -   ; channel: phase1_snps_idx
  port 10: (value) -   ; channel: dbsnp
  port 11: (value) -   ; channel: dbsnp_index
  port 12: (cntrl) OPEN; channel: $

[process] VariantRecalibrator_INDELs
  status=ACTIVE
  port 0: (queue) OPEN; channel: -
  port 1: (value) -   ; channel: ref
  port 2: (value) -   ; channel: ref_fai
  port 3: (value) -   ; channel: ref_dict
  port 4: (value) -   ; channel: dbsnp
  port 5: (value) -   ; channel: dbsnp_index
  port 6: (value) -   ; channel: golden_indel
  port 7: (value) -   ; channel: golden_indel_idx
  port 8: (cntrl) OPEN; channel: $

[process] ApplyVQSR
  status=ACTIVE
  port 0: (queue) OPEN; channel: -
  port 1: (cntrl) OPEN; channel: $

Please let me know if any more info is required for troubleshooting. Thanks.

Paolo Di Tommaso
@pditommaso
Jun 22 2018 07:07
@caspargross please open an issue including the task trace and the snippet raising that issue
@veeravalli there's something odd in the first input of VariantRecalibrator_SNPs, how is it defined?
Lavanya Veeravalli
@veeravalli
Jun 22 2018 07:27
Thanks Paolo. Here are my snippets. I have removed the command lines to save some space.
 process HardFilterAndMakeSitesOnlyVcf {
    tag "$sample_key"
    input:
        set sample_key, file("${sample_key}.raw.vcf.gz"), file("${sample_key}.raw.vcf.gz.tbi") from gtvcf_merge_ch1
    output:
        set sample_key, file("${sample_key}.sites_only_vcf_filename.vcf.gz"), file("${sample_key}.sites_only_vcf_filename.vcf.gz.tbi") into sites_only_vcf_ch1, sites_only_vcf_ch2, sites_only_vcf_ch3
    script:
    """
    gatk VariantFiltration \
        **********
    gatk  MakeSitesOnlyVcf \
        **********
    """
}
process VariantRecalibrator_SNPs {
    tag "sample $sample_key"
    input:
        set sample_key, file("${sample_key}.sites_only_vcf_filename.vcf.gz"), file("${sample_key}.sites_only_vcf_filename.vcf.gz.tbi") from sites_only_vcf_ch1
        file(ref)
        file(ref_fai)
        file(ref_dict)
        file(hapmap)
        file(hapmap_idx)
        file(omni)
        file(omni_idx)
        file(phase1_snps)
        file(phase1_snps_idx)
        file(dbsnp)
        file(dbsnp_index)
    output:
        set sample_key, file("${sample_key}.recalibrate_SNP.recal"), file("${sample_key}.recalibrate_SNP.recal.idx"), file("${sample_key}.recalibrate_SNP.tranches") into variantrecalibrator_SNP_ch
    script:
    """
    gatk VariantRecalibrator \
        **********
    """   
}
process VariantRecalibrator_INDELs {
    tag "sample $sample_key"
    input:
        set sample_key, file("${sample_key}.sites_only_vcf_filename.vcf.gz"), file("${sample_key}.sites_only_vcf_filename.vcf.gz.tbi") from sites_only_vcf_ch2
        file(ref)
        file(ref_fai)
        file(ref_dict)
        file(dbsnp)
        file(dbsnp_index)
        file(golden_indel)
        file(golden_indel_idx)
    output:
        set sample_key, file("${sample_key}.recalibrate_INDEL.recal"), file("${sample_key}.recalibrate_INDEL.recal.idx"), file("${sample_key}.recalibrate_INDEL.tranches") into variantrecalibrator_INDEL_ch
    script:
    """
    gatk VariantRecalibrator \
        **********
    """   
}
process ApplyVQSR {
    tag "sample $sample_key"
    publishDir "${params.publishdir}/${sample_key}", mode: 'copy'
    input:
        set sample_key, file("${sample_key}.sites_only_vcf_filename.vcf.gz"), file("${sample_key}.sites_only_vcf_filename.vcf.gz.tbi"), file("${sample_key}.recalibrate_INDEL.recal"),  file("${sample_key}.recalibrate_INDEL.recal.idx"), file("${sample_key}.recalibrate_INDEL.tranches"), file("${sample_key}.recalibrate_SNP.recal"), file("${sample_key}.recalibrate_SNP.recal.idx"), file("${sample_key}.recalibrate_SNP.tranches") from sites_only_vcf_ch3.join(variantrecalibrator_INDEL_ch).join(variantrecalibrator_SNP_ch)
    output:
         set sample_key, file("${sample_key}.recalibrated.vcf.gz"), file("${sample_key}.recalibrated.vcf.gz.tbi") into results_ch
    script:
    """
   gatk  ApplyVQSR \
        **********
    """
}
It hangs after completing HardFilterAndMakeSitesOnlyVcf step successfully.
Paolo Di Tommaso
@pditommaso
Jun 22 2018 07:35
ummm.. need more info
please rerun as
nextflow -trace nextflow run ... -resume
then open an issue uploading the .nextflow.log and possibly the script code, or at least a skeleton of it
Tobias Neumann
@t-neumann
Jun 22 2018 07:42

I'm trying to pass a variable from one channel to another, which I have been doing in quite a few workflows already, but I'm getting:

ERROR ~ No such variable: lane

This is the (quite simple) process I'm trying to run:

process centrifuge {

    tag { lane }

    container = 'docker://obenauflab/virusintegration:latest'

    input:
    set val(lane), val(paired), file(reads) from fastqFilesFromBam

    output:
    set val(lane), val(paired), file "*centrifuge_report.tsv" into centrifugeChannel

    shell:

    if( paired == 'True' )
        '''
        centrifuge -x !{params.centrifugeIndex} -q -p !{task.cpus} -1 !{reads[0]} -2 !{reads[1]} --report-file !{lane}_centrifuge_report.tsv > /dev/null
        '''
    else
        '''
        centrifuge -x !{params.centrifugeIndex} -q -p !{task.cpus} -U !{reads} --report-file !{lane}_centrifuge_report.tsv > /dev/null
        '''

}

Variables must be passed, because when I change the output to:

file "*centrifuge_report.tsv" into centrifugeChannel

it works like a charm

Paolo Di Tommaso
@pditommaso
Jun 22 2018 07:59
are you saying that replacing
 output:
    set val(lane), val(paired), file "*centrifuge_report.tsv" into centrifugeChannel
with
 output:
    file "*centrifuge_report.tsv" into centrifugeChannel
it works?
Tobias Neumann
@t-neumann
Jun 22 2018 08:52
exactly
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:33
what if you use instead
 output:
    set val(lane), val(paired), file("*centrifuge_report.tsv") into centrifugeChannel
sendivogius
@sendivogius
Jun 22 2018 09:42
Hi, is it possible to hide warnings "The config file defines settings for an unknown process: foobar"?
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:43
yes, remove that setting :smile:
Tobias Neumann
@t-neumann
Jun 22 2018 09:43
@pditommaso it works. damn. can you quickly tell me why the version without parentheses fails?
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:44
if so it's a glitch in the parser :/
Tobias Neumann
@t-neumann
Jun 22 2018 09:44
ok - I'll take that ;)
btw I still have the problem of not being able to set containers for processes in the config file
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:45
stick to parentheses when using set
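For reference, the advice above applied to the whole centrifuge process might look like the sketch below. It assumes the same DSL1-style syntax used in this chat and changes only the output declaration; the container line is left out for brevity. Why the bare `file "..."` form fails inside `set` is not established here beyond "a glitch in the parser", so treat the parenthesized form as a workaround, not an explanation.

```groovy
// Sketch only: the centrifuge process from above, with every element of the
// output `set` written in the parenthesized form (DSL1-era syntax).
// The container directive is omitted here for brevity.
process centrifuge {

    tag { lane }

    input:
    set val(lane), val(paired), file(reads) from fastqFilesFromBam

    output:
    // file(...) with parentheses instead of the bare `file "..."` form
    set val(lane), val(paired), file("*centrifuge_report.tsv") into centrifugeChannel

    shell:
    if( paired == 'True' )
        '''
        centrifuge -x !{params.centrifugeIndex} -q -p !{task.cpus} -1 !{reads[0]} -2 !{reads[1]} --report-file !{lane}_centrifuge_report.tsv > /dev/null
        '''
    else
        '''
        centrifuge -x !{params.centrifugeIndex} -q -p !{task.cpus} -U !{reads} --report-file !{lane}_centrifuge_report.tsv > /dev/null
        '''
}
```

With this form, `lane` and `paired` are emitted alongside the report file, so a downstream process can keep using them as keys.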
Tobias Neumann
@t-neumann
Jun 22 2018 09:45
yes I will - thanks
It keeps telling me:
WARN: Unknown directive bamToFastq for process bamToFastq
WARN: Unknown directive centrifuge for process bamToFastq
this is the corresponding part from my config
process {

    publishDir = [path: './results', mode: 'copy', overwrite: 'true']

    errorStrategy = 'retry'
    maxRetries = 3
    maxForks = 20

    cpus = 1
    time = { 5.h * task.attempt }
    memory = { 10.GB * task.attempt }

    withName:bamToFastq {
        container = 'docker://obenauflab/virusintegration:latest'
    }
    withName:centrifuge {
        container = 'docker://obenauflab/virusintegration:latest'
    }
}
for now I set the containers explicitly in the process in main.nf
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:48
it looks fine
nextflow -version ?
Tobias Neumann
@t-neumann
Jun 22 2018 09:52
nextflow -version
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/tmp

      N E X T F L O W
      version 0.28.0 build 4779
      last modified 10-03-2018 12:13 UTC (13:13 CEST)
      cite doi:10.1038/nbt.3820
      http://nextflow.io
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:53
not even sure that the withName: feature existed in that version .. update to the latest one
Tobias Neumann
@t-neumann
Jun 22 2018 09:54
what's the best way to update?
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:54
nextflow -self-update
Tobias Neumann
@t-neumann
Jun 22 2018 09:55
looking good! Love how there's so much progress on Nextflow! Keep it up - thanks
Paolo Di Tommaso
@pditommaso
Jun 22 2018 09:56
:v:
marchoeppner
@marchoeppner
Jun 22 2018 10:48
hi, got a question about OpenStack and Nextflow. At the moment there is very nice support for AWS as a compute platform, allowing a pipeline to dynamically create resources in the cloud and deactivate them, based on the pipeline design (and set limits). Does anything like that exist for OpenStack at all?
the only solution I have at the moment is to launch an elasticluster, but that means blocking resources even tho they are not needed (temporarily)
marchoeppner
@marchoeppner
Jun 22 2018 10:56
uh, scope cloud, that looks interesting - seems new(?)
ah but that also refers to AWS, ...
Paolo Di Tommaso
@pditommaso
Jun 22 2018 11:00
Openstack is not currently a priority for us
marchoeppner
@marchoeppner
Jun 22 2018 11:00
probably worth looking at, I reckon - Elixir has a 60.000 Core Openstack cloud, looking to further expand that
EU open science cloud is also gonna be openstack, I think
Paolo Di Tommaso
@pditommaso
Jun 22 2018 11:00
alternatives:
  • use a Kubernetes cluster over Openstack
  • consider contributing an implementation of native support
marchoeppner
@marchoeppner
Jun 22 2018 11:01
yes, got stuck on kubernetes at the moment, difficult to wrap my head around how that would work in practice, ugh
Paolo Di Tommaso
@pditommaso
Jun 22 2018 11:01
I think Elixir is planning to use Kubernetes for final users
Alexander Peltzer
@apeltzer
Jun 22 2018 11:02
As far as I know too
The documentation on that is kind of vague right now (as it's still a work in progress), but so far there have been a couple of comments about using Kubernetes in the large-scale context in the end
Vladimir Kiselev
@wikiselev
Jun 22 2018 11:03
the k8s community seems to be growing super fast, and it also allows you to do much more in addition to running Nextflow
I see it as the best orchestration tool for OpenStack at the moment
Paolo Di Tommaso
@pditommaso
Jun 22 2018 11:04
Openstack is meant for VM provisioning, not for job scheduling
marchoeppner
@marchoeppner
Jun 22 2018 11:10
indeed
well, you can use it like that, but it seems somewhat awkward anyway (i.e. elasticluster)
Kevin Sayers
@KevinSayers
Jun 22 2018 11:11
I agree that k8s is probably the way to go. Something else that might be worth looking at is htcondor + openstack. No personal experience, but there seem to be ways to integrate the two.
Paolo Di Tommaso
@pditommaso
Jun 22 2018 11:13
folks, you want openstack native support? implement it :wink:
open source is meant for that
marchoeppner
@marchoeppner
Jun 22 2018 11:25
ok, so re: kubernetes - essentially this is like a "traditional" cluster in the sense that I have finite resources existing in a pool; but these resources are then used via kubernetes pods, which are launched depending on availability. so resources would dynamically scale like on an old-school HPC?
difference being that pods are "like" docker containers
?
Paolo Di Tommaso
@pditommaso
Jun 22 2018 11:27
a pod is essentially a container execution
from the perspective of an NF user, it's just another cluster backend
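To illustrate "just another cluster backend": switching a pipeline to Kubernetes is mostly a configuration change. The fragment below is a minimal sketch that assumes a Nextflow release with the `k8s` executor and an existing shared persistent volume claim. The namespace, claim name, mount path, and image name are all placeholders, not values from this conversation.

```groovy
// nextflow.config -- minimal sketch, every value below is a placeholder
process {
    executor  = 'k8s'                        // each task runs as a Kubernetes pod
    container = 'my-org/my-image:latest'     // hypothetical image used by the pods
}

k8s {
    namespace        = 'my-namespace'        // hypothetical namespace
    storageClaimName = 'my-shared-pvc'       // hypothetical claim backing the work directory
    storageMountPath = '/workspace'          // where that claim is mounted inside the pods
}
```

The process definitions in main.nf stay the same; only the executor settings differ from a PBS or SLURM setup.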
micans
@micans
Jun 22 2018 13:50
(1) In the report.html 'Tasks' section, the script column truncates multi-line commands. I understand the reasoning, as script sections can be large. Still, I just had a user asking about the commands and parameters used, and this Tasks section seems ideal for that. Which leads to (2): I would like a feature to export this section stand-alone as a tab-separated file; it would help a lot with automated reporting on top of either a single Nextflow run or a collection of them.
Paolo Di Tommaso
@pditommaso
Jun 22 2018 13:52
1) it should not be truncated, it should be scrollable
micans
@micans
Jun 22 2018 13:52
Mmm. It had a scroller and I scrolled ... maybe a display issue.
Paolo Di Tommaso
@pditommaso
Jun 22 2018 13:52
2) check nextflow log <run name> -f script
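Besides `nextflow log`, another possible route to a stand-alone tab-separated file is the trace report. The fragment below is a sketch, not something suggested in this conversation: it assumes the `trace` config scope and that `script` is accepted as a trace field in the installed version. The file name and the field selection are illustrative only.

```groovy
// nextflow.config -- sketch; file name and field list are illustrative
trace {
    enabled = true
    file    = 'pipeline_trace.tsv'                      // hypothetical output path
    fields  = 'task_id,hash,name,status,exit,script'    // assumes `script` is a supported field
}
```

Multi-line scripts may not fit a one-row-per-task layout cleanly, so check the output on a small run before building reports on top of it.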
micans
@micans
Jun 22 2018 13:53
Ooooooooooo. Interesting! BRB!
Paolo Di Tommaso
@pditommaso
Jun 22 2018 13:53
:wink:
micans
@micans
Jun 22 2018 13:57
Love it, that's definitely worth a big :beer: if I make it to Barcelona :-)
Paolo Di Tommaso
@pditommaso
Jun 22 2018 13:57
can't wait
Phil Ewels
@ewels
Jun 22 2018 21:19
Not exactly scrollable - it is truncated but if you click it you should see the full command
(Sorry, late reply @micans )
From memory anyway