    Thomas Dyar
    @tom-dyar
    quick question: does the SNAP alignment in avocado run in parallel over a single sample within a single input BAM file? Wondering if using avocado for alignment / preprocessing will help turnaround time for our per-run qc pipeline?
    Allen Day
    @allenday
    I'm getting this error from avocado: "java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to org.bdgenomics.formats.avro.NucleotideContigFragment" using config basic.properties
    Allen Day
    @allenday
    I see from the parquet metadata that the reference chromosomes are of class NucleotideContigFragment.
    so I interpret this to mean that the reads, which are of class AlignmentRecord, are being converted at some point to GenericData$Record, which sounds erroneously generic.
    anyone else seen this? any idea, @fnothaft ?
    Andrew Chen
    @andrewmchen
    hi allen! what command are you running to get this?
    Allen Day
    @allenday
    I'm updated to HEAD of the avocado git repo
    problem shows up if I invoke like this:
    /path/to/avocado-submit /path/to/MT.bam.adam /path/to/human_g1k_v37.fasta.adam /path/to/out-avocado /path/to/avocado-sample-configs/basic.properties
    however, if I use unconverted data, like:
    /path/to/avocado-submit /path/to/MT.bam /path/to/human_g1k_v37.fasta /path/to/out-avocado /path/to/avocado-sample-configs/basic.properties
    Andrew Chen
    @andrewmchen
    have you seen the response in the ADAM gitter? I think it may be because you're using an old reference .adam file.
    Allen Day
    @allenday
    ok, making some progress on this. it looks like you're right, the latest pull from adam repo and rebuilding old .adam files at least gets me past that error.
    thx
    Luca Pireddu
    @ilveroluca
    Hello people. Is anyone having problems running avocado on large-ish datasets?
    though I've had success with small input, I haven't been able to get it to successfully complete a job on anything larger than about 20 GB
    Erin Jerri Pangilinan
    @erinjerri
    does anyone actually use Apache Drill over Apache Spark for anything in big data genomics? I was thinking not. There’s a training at MapR next week on it (very introductory). I'm familiar with Spark but not with Drill, and would like to know folks’ thoughts on what actual practitioners use and prefer, and why
    Erin Jerri Pangilinan
    @erinjerri
    ah nm, seems just like another add-on
    Khaled Nasri
    @_Nasri81_twitter
    hello
    shibuvp
    @shibuvp
    can you share avocado-submit command?
    salimbakker
    @salimbakker
    17/07/06 16:51:00 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.16.2.121:32910 (size: 30.3 KB, free: 366.3 MB)
    17/07/06 16:51:00 INFO SparkContext: Created broadcast 0 from newAPIHadoopFile at ADAMContext.scala:376
    17/07/06 16:51:00 WARN BiallelicGenotyper: Input RDD is not persisted. Performance may be degraded.
    Command body threw exception:
    java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
    17/07/06 16:51:00 INFO BiallelicGenotyper: Overall Duration: 3.41 secs
    Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
    at org.bdgenomics.avocado.genotyping.DiscoverVariants$.variantsInRdd(DiscoverVariants.scala:83)
    at org.bdgenomics.avocado.genotyping.DiscoverVariants$$anonfun$apply$1.apply(DiscoverVariants.scala:56)
    at org.bdgenomics.avocado.genotyping.DiscoverVariants$$anonfun$apply$1.apply(DiscoverVariants.scala:56)
    at scala.Option.fold(Option.scala:158)
    at org.apache.spark.rdd.Timer.time(Timer.scala:48)
    at org.bdgenomics.avocado.genotyping.DiscoverVariants$.apply(DiscoverVariants.scala:54)
    at org.bdgenomics.avocado.genotyping.BiallelicGenotyper$.discoverAndCall(BiallelicGenotyper.scala:153)
    at org.bdgenomics.avocado.cli.BiallelicGenotyper$$anonfun$4.apply(BiallelicGenotyper.scala:228)
    at org.bdgenomics.avocado.cli.BiallelicGenotyper$$anonfun$4.apply(BiallelicGenotyper.scala:228)
    at scala.Option.fold(Option.scala:158)
    at org.bdgenomics.avocado.cli.BiallelicGenotyper.run(BiallelicGenotyper.scala:235)
    at org.bdgenomics.utils.cli.BDGSparkCommand$class.run(BDGCommand.scala:55)
    at org.bdgenomics.avocado.cli.BiallelicGenotyper.run(BiallelicGenotyper.scala:196)
    at org.bdgenomics.avocado.cli.AvocadoMain$$anonfun$run$3.apply(AvocadoMain.scala:75)
    at org.bdgenomics.avocado.cli.AvocadoMain$$anonfun$run$3.apply(AvocadoMain.scala:74)
    at scala.Option.fold(Option.scala:158)
    at org.bdgenomics.avocado.cli.AvocadoMain.run(AvocadoMain.scala:74)
    at org.bdgenomics.avocado.cli.AvocadoMain$.main(AvocadoMain.scala:26)
    at org.bdgenomics.avocado.cli.AvocadoMain.main(AvocadoMain.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    17/07/06 16:51:00 INFO SparkContext: Invoking stop() from shutdown hook
    17/07/06 16:51:00 INFO ServerConnector: Stopped ServerConnector@84bbff{HTTP/1.1}{0.0.0.0:4040}
    17/07/06 16:51:00 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@6db66836{/stages/stage/kill,null,UNAVAILABLE}
    17/07/06 16:51:00 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@3574e198{/jobs/job/kill,null,UNAVAILABLE}
    17/07/06 16:51:00 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@27e0f2f5{/api,null,UNAVAILABLE}
    17/07/06 16:51:00 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@9cd25ff{/,null,UNAVAILABLE}
    17/07/06 16:51:00 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@69f63d95{/static,null,UNAVAILABLE}
    17/07/06 16:51:00 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@660e9100{/executors/threadDump/json,null,UNAVAILABLE}
    17/07/06 16:51:00 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@6928f576{/executors/threadDump,null,UNAVAILABL
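    Note on the trace above: a NoSuchMethodError on scala.reflect.api.JavaUniverse.runtimeMirror is commonly a symptom of mixed Scala binary versions on the classpath, for example an application jar built against one Scala version running on a Spark build that uses another. Nobody in this room confirmed that diagnosis, so treat it as an assumption. A minimal sketch of one way to check follows. It relies on the Maven convention that Scala artifacts carry the binary version as a name suffix (such as _2.11), and the jar names in it are hypothetical, not taken from the log.

```python
import re

# Scala libraries published to Maven conventionally carry the Scala binary
# version as an artifact-name suffix, e.g. "_2.11" in "spark-core_2.11-2.1.0.jar".
_SCALA_SUFFIX = re.compile(r"_(2\.\d+)(?=[-.])")


def scala_binary_versions(jar_names):
    """Map each Scala binary version found in jar names to the jars using it."""
    found = {}
    for name in jar_names:
        match = _SCALA_SUFFIX.search(name)
        if match:
            found.setdefault(match.group(1), []).append(name)
    return found


def has_mismatch(jar_names):
    """Return True when the jars were built for more than one Scala version."""
    return len(scala_binary_versions(jar_names)) > 1


# Hypothetical jar names, for illustration only.
jars = [
    "avocado-cli_2.10-0.0.1.jar",
    "spark-core_2.11-2.1.0.jar",
    "htsjdk-2.9.1.jar",  # plain Java library, no Scala suffix, so it is ignored
]

for version, names in sorted(scala_binary_versions(jars).items()):
    print(version, names)
```

    In practice the list would come from the application's assembly jar and the jars directory of the Spark installation. If has_mismatch returns True, rebuilding the application for the Scala version the cluster uses is the usual remedy.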
    Peter van 't Hof
    @ffinfo

    I have some questions about this file https://github.com/bigdatagenomics/avocado/blob/master/avocado-core/src/main/scala/org/bdgenomics/avocado/genotyping/DiscoverVariants.scala

    Why are all methods here private? This way I can't use it as a library inside a full in-memory pipeline
    maybe I'm missing a special api file? ;)