Sören Brunk
@sbrunk
We're using a somewhat similar method (a bit more involved, i.e. a multi-stage Docker build) to create an almond Docker image without having published almond artifacts (for snapshot builds).
https://github.com/almond-sh/almond/blob/master/Dockerfile
Alfonso Roa
@alfonsorr
a question maybe more related to coursier: is there a way to download some dependencies (when creating a container) and access them from almond? The objective is to build the Docker image with some dependencies already in it and not have to download them with ivy every time
Sören Brunk
@sbrunk
@alfonsorr adding a RUN coursier fetch <lib> in the Docker build should do the trick to prepopulate the coursier cache. almond uses the coursier API through Ammonite, so it should then find the cached artifacts in the running container.
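Since the resolution goes through the coursier API, here is a minimal sketch of what such a prefetch amounts to programmatically (assuming the coursier 2.x high-level API is on the classpath; cats-core is just a stand-in dependency):

import coursier._

// resolve and download the artifacts into the local coursier cache,
// the same cache a later `import $ivy` in the notebook will hit
val files = Fetch()
  .addDependencies(
    Dependency(Module(Organization("org.typelevel"), ModuleName("cats-core_2.12")), "2.1.1")
  )
  .run()

files.foreach(f => println(f.getAbsolutePath))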
Wojtek Pituła
@Krever
@sbrunk thanks!
Wojtek Pituła
@Krever
I have one more question, hopefully a simple yes/no one, but only partially related to almond. Is it possible to store notebook state? I have a use case in which I need to wait in the middle of a notebook for a potentially very long time, so I would like to stop it and, after I get a completion event, revive it from where it stopped.
Ammonite session persistence comes to mind, but I'm curious if anyone has actually tried that
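For reference, Ammonite's session API only checkpoints state within a running process — a minimal sketch, assuming the Ammonite repl instance is in scope (expensiveComputation is a hypothetical placeholder):

// checkpoint the session after some long-running work
val result = expensiveComputation()
repl.sess.save("after-long-step")

// later, in the same process, roll back to that checkpoint
repl.sess.load("after-long-step")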
Alfonso Roa
@alfonsorr
@sbrunk great, thanks!
Alfonso Roa
@alfonsorr
I also added coursier fetch --sources <lib> to skip all the downloads in the notebook, it works great, thanks!
Sören Brunk
@sbrunk
@alfonsorr I just remembered that you can even take it one step further. You can execute a notebook programmatically using nbconvert in the Docker build, which will do a coursier fetch as well. The advantage is that it will stay in sync with your import $ivy ... statements in the notebook.
I did that for my Scala Days talk about almond because I used a notebook as live slides. https://github.com/sbrunk/scaladays-2019/blob/master/scripts/jupyter.sh#L87
@Krever I think saved Ammonite sessions don't persist across different runs of Ammonite, but I'm not sure
Wojtek Pituła
@Krever
yeah, seems like it ;/ I just went through the docs and can't find anything about out-of-process persistence
Alfonso Roa
@alfonsorr
I was thinking of launching a Scala script with the imports, but that idea is much better
Henry
@hygt
hello, this is more related to ammonite-spark, but I'm trying to run the REPL and then hopefully notebooks on my company cluster, which is running Hortonworks/Cloudera HDP 2.6.5
creating the Spark session fails:
Exception in thread "main" java.lang.NoClassDefFoundError: scala/MatchError
    at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:833)
    at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.lang.ClassNotFoundException: scala.MatchError
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 2 more
this looks like a typical binary compatibility conflict, but I have no idea how to track it down
Sören Brunk
@sbrunk
@hygt do the Scala versions of your Spark installation and Ammonite match? Most Spark distributions are still on Scala 2.11 (Spark 2.4 supports Scala 2.12, but it's not the default), while Ammonite has dropped 2.11 support after version 1.6.7
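A quick way to check both sides from the REPL or kernel (a sketch; the Spark line assumes Spark is already on the classpath):

// Scala version the REPL / kernel itself runs on
println(scala.util.Properties.versionNumberString)
// Spark version found on the classpath
println(org.apache.spark.SPARK_VERSION)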
Henry
@hygt
yes, I know
I've rebuilt Spark and it seems to work now
I'm not using the provided Spark binaries, exactly because of Scala 2.11
now I get a connectivity error (from the executor to the client?) but I guess that's unrelated, let's do more digging...
Henry
@hygt
so I've tried many things but I always have issues if I load local Spark jars
unsetting SPARK_HOME gets me a little further
so maybe I should publish my patched version of Spark to our local artifactory and let AmmoniteSparkSession fetch the jars
Alfonso Roa
@alfonsorr
can you show the imports and the way you create the SparkSession?
and be sure to use a 2.11 kernel and Spark version 2.3.2
Henry
@hygt
I'm using Scala 2.12, Spark 2.4.4 and ammonite-spark 0.9.0
I was trying something like this:
import $ivy.`sh.almond::ammonite-spark:0.9.0`
import ammonite.ops._

// load the jars of the local Spark distribution into the session classpath
val sparkHome = sys.env("SPARK_HOME")
val sparkJars = ls ! Path(sparkHome) / 'jars
interp.load.cp(sparkJars)

// @ starts a new compilation unit, so the code below sees the jars loaded above
@

import org.apache.spark.sql._

val spark =
  AmmoniteSparkSession
    .builder()
    .loadConf("my.conf")
    .progressBars()
    .master("yarn")
    .getOrCreate()
but I always ran into some issue or another
I published my patched Spark to our Artifactory, unset SPARK_HOME, and it works much better
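A sketch of that classpath-only route, with Spark pulled as regular dependencies instead of loaded from SPARK_HOME (the coordinates are assumptions; a patched build would come from the internal Artifactory repository instead):

import $ivy.`org.apache.spark::spark-sql:2.4.4`
import $ivy.`sh.almond::ammonite-spark:0.9.0`

import org.apache.spark.sql._

// no interp.load.cp of local jars: Spark comes from the resolved dependencies,
// and ammonite-spark ships the session classpath to the YARN executors
val spark =
  AmmoniteSparkSession
    .builder()
    .master("yarn")
    .getOrCreate()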
Henry
@hygt
ok, now I have another issue: the shell works but notebooks not so much :grimacing:
Creating SparkSession
Error initializing SparkContext.
java.io.IOException: No FileSystem for scheme: jar
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
 ...
using the same configuration file
Henry
@hygt
OK, I found a workaround: alexarchambault/ammonite-spark#118
now I have a binary compatibility issue with Jackson, even though all the jars in my session look aligned :grimacing:
com.fasterxml.jackson.databind.JsonMappingException: Scala module 2.10.3 requires Jackson Databind version >= 2.10.0 and < 2.11.0
  com.fasterxml.jackson.module.scala.JacksonModule.setupModule(JacksonModule.scala:61)
...
org.apache.spark.sql.almondinternals.NotebookSparkSessionBuilder.getOrCreate(NotebookSparkSessionBuilder.scala:62)
and my jars:
List(
  "/jackson-module-paranamer-2.10.3.jar",
  "/jackson-module-scala_2.12-2.10.3.jar",
  "/jackson-databind-2.10.3.jar",
  "/jackson-annotations-2.10.3.jar",
  "/jackson-core-2.10.3.jar",
  "/jackson-core-2.10.1.jar",
  "/jackson-datatype-jdk8-2.10.1.jar",
  "/jackson-databind-2.10.1.jar",
  "/jackson-datatype-jsr310-2.10.1.jar",
  "/jackson-annotations-2.10.1.jar",
  "/jackson-annotations-2.10.3.jar",
  "/jackson-databind-2.10.3.jar",
  "/jackson-module-paranamer-2.10.3.jar",
  "/jackson-module-scala_2.12-2.10.3.jar",
  "/jackson-core-2.10.3.jar"
)
Henry
@hygt
ha, I think I understand why: Jackson is in the almond launcher assembly! OK, I know how to fix this...
sorry about all my rambling, this is slowly driving me crazy
Wojtek Pituła
@Krever
Do any of you maybe have a way of retrieving the filename of the notebook being run? Not that this is specific to almond, but most of the solutions on the internet use Python, so I wanted to ask you first.
Andrew
@sheerluck

@Krever

%%javascript
var kernel = IPython.notebook.kernel;
var thename = window.document.getElementById("notebook_name").innerHTML;
var command = "theNotebook = " + "'"+thename+"'";
kernel.execute(command);

Wojtek Pituła
@Krever
Awesome, thanks!
I wanted to hide it in my Scala lib, but I'll figure something out...
Sören Brunk
@sbrunk
@Krever you totally can. Here’s the almond version:
kernel.publish.js("""
  var thename = window.document.getElementById("notebook_name").innerHTML;
  var command = 'val theNotebook = "'+thename+'"';
  Jupyter.notebook.kernel.execute(command);
""")
Note that it only works in the classic notebook, not in JupyterLab due to security restrictions
Sören Brunk
@sbrunk
But you can put it into a library by adding scala-kernel-api as a provided dependency as described in the docs. https://almond.sh/docs/api-access-instances#from-a-library
@hygt Does it work for you now? Anyway, thanks for sharing what you’ve found trying to solve these issues.
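A minimal sketch of that library setup (the coordinates, version, cross-versioning, and the almond.api.JupyterApi type are assumptions here — check the linked docs for the exact setup):

// build.sbt: depend on the kernel API without bundling it
libraryDependencies += ("sh.almond" % "scala-kernel-api" % "0.9.0" % Provided)
  .cross(CrossVersion.full)

// library code: take the kernel API implicitly; the notebook provides the instance
def injectNotebookName()(implicit kernel: almond.api.JupyterApi): Unit =
  kernel.publish.js(
    """var name = window.document.getElementById("notebook_name").innerHTML;
      |Jupyter.notebook.kernel.execute('val theNotebook = "' + name + '"');""".stripMargin
  )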
Wojtek Pituła
@Krever
@sbrunk thanks, I didn't know such things were possible. I'll have to figure out something that works in JupyterLab, but now I have all the pieces on the almond side.
Henry
@hygt
@sbrunk yes, I've gotten to a point where it works. I've built the almond launcher with coursier's --assembly-rule exclude-pattern ... to get rid of the Jackson and Json4s classes. No more binary compatibility issues.