Wojtek Pituła
@Krever
yeah, seems like so ;/ I just went through the docs and can't find anything about out-of-process persistence
Alfonso Roa
@alfonsorr
i was thinking to launch a scala script with the imports but that idea is much better
Henry
@hygt
hello, this is more related to ammonite-spark but I'm trying to run the REPL and then hopefully notebooks on my company cluster, it's running Hortonworks/Cloudera HDP 2.6.5
creating the Spark sessions fails
Exception in thread "main" java.lang.NoClassDefFoundError: scala/MatchError
    at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:833)
    at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.lang.ClassNotFoundException: scala.MatchError
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 2 more
this looks like some typical binary compatibility conflict, but I have no idea how to track it down
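One generic way to start tracking down a conflict like this (a sketch, not from the chat) is to ask the JVM which jar a suspect class was actually loaded from:

```scala
// Sketch: locate which jar, if any, a class was loaded from.
// Useful when chasing binary-compatibility conflicts like the one above.
def whereIs(className: String): Option[String] =
  Option(Class.forName(className).getProtectionDomain.getCodeSource)
    .map(_.getLocation.toString)

// Classes from the bootstrap classloader have no code source:
println(whereIs("java.lang.String"))   // None
// scala-library classes normally resolve to a jar path:
println(whereIs("scala.MatchError"))
```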
Sören Brunk
@sbrunk
@hygt do the Scala versions of your Spark installation and Ammonite match? Most Spark distributions are still on Scala 2.11 (Spark 2.4 supports Scala 2.12 but it's not the default), while Ammonite dropped 2.11 support after version 1.6.7
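A quick way to confirm the kernel's side of that comparison from inside a session (a minimal check, not from the chat):

```scala
// Print the Scala version the kernel itself is running on.
println(scala.util.Properties.versionNumberString) // e.g. "2.12.10"

// With Spark on the classpath, compare against the Spark build:
// println(org.apache.spark.SPARK_VERSION)
```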
Henry
@hygt
yes, I know
I've rebuilt Spark and it seems to work now
I'm not using the provided Spark binaries, exactly because of Scala 2.11
now I get a connectivity error (from the executor to the client?) but I guess that's unrelated, let's do more digging...
Henry
@hygt
so I've tried many things but I always have issues if I load local Spark jars
unsetting SPARK_HOME gets me a little further
so maybe I should publish my patched version of Spark to our local artifactory and let AmmoniteSparkSession fetch the jars
Alfonso Roa
@alfonsorr
can you show the imports and the way you create the SparkSession?
and be sure to use a 2.11 kernel and spark version 2.3.2
Henry
@hygt
I'm using scala 2.12, spark 2.4.4 and ammonite-spark 0.9.0
I was trying something like that
import $ivy.`sh.almond::ammonite-spark:0.9.0`
import ammonite.ops._

val sparkHome = sys.env("SPARK_HOME")
val sparkJars = ls ! Path(sparkHome) / 'jars
interp.load.cp(sparkJars)

@

import org.apache.spark.sql._

val spark =
  AmmoniteSparkSession
    .builder()
    .loadConf("my.conf")
    .progressBars()
    .master("yarn")
    .getOrCreate()
but I always ran into some issue or another
I published my patched Spark to our artifactory, unset SPARK_HOME and it works much better
Henry
@hygt
ok now I have another issue, shell works but notebooks not so much :grimacing:
Creating SparkSession
Error initializing SparkContext.
java.io.IOException: No FileSystem for scheme: jar
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
 ...
using the same configuration file
Henry
@hygt
OK, I found a workaround: alexarchambault/ammonite-spark#118
now I have some binary compatibility issue with Jackson even though all jars in my session look aligned :grimacing:
com.fasterxml.jackson.databind.JsonMappingException: Scala module 2.10.3 requires Jackson Databind version >= 2.10.0 and < 2.11.0
  com.fasterxml.jackson.module.scala.JacksonModule.setupModule(JacksonModule.scala:61)
...
org.apache.spark.sql.almondinternals.NotebookSparkSessionBuilder.getOrCreate(NotebookSparkSessionBuilder.scala:62)
and my jars:
List(
  "/jackson-module-paranamer-2.10.3.jar",
  "/jackson-module-scala_2.12-2.10.3.jar",
  "/jackson-databind-2.10.3.jar",
  "/jackson-annotations-2.10.3.jar",
  "/jackson-core-2.10.3.jar",
  "/jackson-core-2.10.1.jar",
  "/jackson-datatype-jdk8-2.10.1.jar",
  "/jackson-databind-2.10.1.jar",
  "/jackson-datatype-jsr310-2.10.1.jar",
  "/jackson-annotations-2.10.1.jar",
  "/jackson-annotations-2.10.3.jar",
  "/jackson-databind-2.10.3.jar",
  "/jackson-module-paranamer-2.10.3.jar",
  "/jackson-module-scala_2.12-2.10.3.jar",
  "/jackson-core-2.10.3.jar"
)
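That list does contain two versions of several Jackson artifacts (2.10.1 and 2.10.3 side by side). A small sketch (not from the chat) to flag such duplicates mechanically, using a simplistic `<name>-<version>.jar` naming assumption:

```scala
// Sketch: flag artifacts that appear in more than one version on the
// classpath. Jar names below are a subset of the list above.
val jars = List(
  "jackson-module-scala_2.12-2.10.3.jar",
  "jackson-databind-2.10.3.jar",
  "jackson-core-2.10.3.jar",
  "jackson-core-2.10.1.jar",
  "jackson-databind-2.10.1.jar"
)

val VersionedJar = """(.+)-(\d[\d.]*)\.jar""".r

val byArtifact: Map[String, Set[String]] =
  jars
    .collect { case VersionedJar(name, version) => name -> version }
    .groupBy(_._1)
    .map { case (name, pairs) => name -> pairs.map(_._2).toSet }

// Artifacts present in several versions at once:
val conflicts = byArtifact.filter { case (_, versions) => versions.size > 1 }
println(conflicts) // jackson-core and jackson-databind each appear twice
```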
Henry
@hygt
ha I think I understand why, jackson is in the almond launcher assembly! ok I know how to fix this...
sorry about all my rambling this is slowly driving me crazy
Wojtek Pituła
@Krever
Do any of you guys maybe have a way of retrieving the notebook filename being run? Not that this is specific to almond, but most of the solutions on the internet use Python, so I want to ask you first.
Andrew
@sheerluck

@Krever

%%javascript
var kernel = IPython.notebook.kernel;
var thename = window.document.getElementById("notebook_name").innerHTML;
var command = "theNotebook = " + "'"+thename+"'";
kernel.execute(command);

Wojtek Pituła
@Krever
Awesome, thanks!
I wanted to hide it in my Scala lib, but I'll figure something out...
Sören Brunk
@sbrunk
@Krever you totally can. Here’s the almond version:
kernel.publish.js("""
  var thename = window.document.getElementById("notebook_name").innerHTML;
  var command = 'val theNotebook = "'+thename+'"';
  Jupyter.notebook.kernel.execute(command);
""")
Note that it only works in the classic notebook, not in JupyterLab due to security restrictions
Sören Brunk
@sbrunk
But you can put it into a library by adding scala-kernel-api as a provided dependency as described in the docs. https://almond.sh/docs/api-access-instances#from-a-library
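For reference, the sbt side of that setup looks roughly like this (the almond version string here is a placeholder; use the one your kernel runs):

```scala
// build.sbt: depend on the almond kernel API at compile time only,
// so the running kernel provides it at runtime.
libraryDependencies +=
  ("sh.almond" % "scala-kernel-api" % "0.9.0" % Provided)
    .cross(CrossVersion.full)
```

`Provided` keeps the API out of your published artifact, and `CrossVersion.full` pins the dependency to the exact Scala version, since the kernel API is published per full Scala version.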
@hygt Does it work for you now? Anyway, thanks for sharing what you’ve found trying to solve these issues.
Wojtek Pituła
@Krever
@sbrunk thanks, didn't know such things are possible. I will have to figure out something that works in JupyterLab, but now I have all the pieces on the almond side.
Henry
@hygt
@sbrunk yes I've gotten to a point where it works. I've built the Almond launcher with coursier's --assembly-rule exclude-pattern ... to get rid of Jackson and Json4s classes. No more binary compatibility issues
my setup is tricky because our codebase shares way too much code between our services and Spark jobs
we can get around bin compat issues with sbt-assembly shading rules
but I was a bit tired of moving 100+ MB fat JARs around just to do some exploration with the Spark shell
Henry
@hygt
also our data scientists would rather use notebooks :smiley:
Pedro Larroy
@larroy
is almond working? I'm trying to run it in jupyter notebook and finding all kinds of problems
First I bumped into this issue: almond-sh/almond#508
now it seems the kernel is hanging
is there a way to debug it?
I separated my statements into smaller chunks and now seems to work
weird
Wojtek Pituła
@Krever
how would you approach rendering a basic graph diagram? just some nodes and edges
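One low-dependency approach (a sketch, not an answer from the chat): build the diagram as an inline SVG string in plain Scala and hand it to the notebook's HTML output, e.g. via the same `kernel.publish` API used earlier for JavaScript. The layout coordinates here are hardcoded for illustration.

```scala
// Sketch: render a tiny node/edge diagram as inline SVG.
case class Node(id: String, x: Int, y: Int)
case class Edge(from: String, to: String)

def toSvg(nodes: Seq[Node], edges: Seq[Edge]): String = {
  val pos = nodes.map(n => n.id -> n).toMap
  val lines = edges.map { e =>
    val (a, b) = (pos(e.from), pos(e.to))
    s"""<line x1="${a.x}" y1="${a.y}" x2="${b.x}" y2="${b.y}" stroke="black"/>"""
  }
  val circles = nodes.map { n =>
    s"""<circle cx="${n.x}" cy="${n.y}" r="12" fill="lightblue"/>""" +
      s"""<text x="${n.x}" y="${n.y + 4}" text-anchor="middle">${n.id}</text>"""
  }
  s"""<svg width="200" height="140">${(lines ++ circles).mkString}</svg>"""
}

val svg = toSvg(
  Seq(Node("A", 40, 40), Node("B", 150, 40), Node("C", 100, 110)),
  Seq(Edge("A", "B"), Edge("B", "C"), Edge("A", "C"))
)
// In an almond notebook: kernel.publish.html(svg)
```

For anything beyond a handful of nodes you'd want a real layout engine (Graphviz, or a JS library like vis.js via `kernel.publish.js`), since hand-placing coordinates doesn't scale.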