    Dean Wampler
    Hi, Martin. Thanks for attending and sorry for the hassle.
    Did you pass the -p 9001:9001 argument?
    docker run -p 9001:9001 andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive
    Hongyuan Yuan
    Hi, I have downloaded spark-notebook-0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive. However, when I am in this folder and run bin/spark-notebook, it asks for my permission. When I use sudo, it says: sudo: bin/spark-notebook: command not found
    I have verified my java version as: java version "1.8.0_121"
    any advice on how to open the spark-notebook? thanks
    I think I just solved it...thanks
    I have uploaded the "JustEnoughScalaForSpark.snb" from the GitHub location, onto the Notebook. However, I don't see it added to the notebooks list. I chose "click here" to upload the notebook. Any help?
    for people having issues seeing the notebook when using Docker, such as @EinserViech_twitter: you might need to use the IP of your docker-machine rather than localhost:9001
    In addition, if you're using Docker, the files will be downloaded inside the Docker container instead of locally on your computer. You can see them by executing bash inside the running container. Do this in a separate terminal: docker exec -it <id of your container> /bin/bash - you'll get a command prompt. Then ls and you should see the data/shakespeare directory
    Hi - I have a question regarding Scala and Spark; not sure if this is the right forum. If not, please direct me to the correct one. In the Scala class we talked about DataFrames and how to use them with Scala. I would like to know how I can execute Spark SQL queries in parallel in a Spark Streaming application. Should I use Scala Futures to submit each DataFrame aggregation, and will those be executed concurrently?
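The Futures pattern asked about above can be sketched in plain Scala, with no Spark dependency, just to show the submission mechanics. Everything here is illustrative: `aggregate` is a stand-in for a blocking DataFrame action such as `df.count()`, and the query strings are placeholders. Whether the jobs actually run in parallel inside Spark also depends on the cluster's scheduler configuration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Stand-in for a blocking Spark action such as df.count();
// here it just returns the query string's length.
def aggregate(query: String): Long = query.length.toLong

val queries = Seq("SELECT ...", "SELECT ... GROUP BY ...")

// Each Future is submitted immediately, so the aggregations run
// concurrently on the execution context's thread pool.
val futures = queries.map(q => Future(aggregate(q)))

// Gather all results, blocking at most 10 seconds.
val results = Await.result(Future.sequence(futures), 10.seconds)
```

In a real streaming job you would replace `aggregate` with the DataFrame action and pick an execution context sized to the number of concurrent jobs you want.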
    Yu Shen
    I'm having the same problem as @mannit . I'm using Ubuntu 16.04, with java SDK:
    java -version
    java version "1.8.0_131"
    Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
    no docker
    Following the instruction below:

    The "click here" text is a link. Click it, then navigate to where you downloaded the tutorial GitHub repository. Find and select notebooks/JustEnoughScalaForSpark.snb.

    A new line in the UI is added with "JustEnoughScalaForSpark.snb" and an "Upload" button on the right-hand side, as shown in Figure 1:

    This step produced the expected outcome.
    The next step:

    Figure 1: Before Uploading the Notebook
    I've highlighted the "click here" link that you used and the new line that was added for the tutorial notebook.

    Click the "Upload" button.

    Now the line is moved towards the bottom of the page and the buttons on the right-hand side are different.

    This step failed to make the notebook appear at the bottom of the page.
    I then also tried the alternative:
    Yu Shen
    I found there were error messages:

    Play server process ID is 2435
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/home/yubrshen/programming/scala/spark-notebook-0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive/lib/ch.qos.logback.logback-classic-1.1.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/home/yubrshen/programming/scala/spark-notebook-0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive/lib/org.slf4j.slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
    [info] play - Application started (Prod)
    [info] play - Listening for HTTP on /0:0:0:0:0:0:0:0:9001
    [DEBUG] [06/07/2017 22:36:04.809] [New I/O worker #1] [EventStream] StandardOutLogger started
    [DEBUG] [06/07/2017 22:36:04.955] [New I/O worker #1] [EventStream(akka://NotebookServer)] logger log1-Slf4jLogger started
    [DEBUG] [06/07/2017 22:36:04.956] [New I/O worker #1] [EventStream(akka://NotebookServer)] Default Loggers started
    [debug] application - Notebooks directory in the config is referring ./notebooks. Does it exist? false
    [info] application - Notebooks dir is ../notebooks [at /home/yubrshen/programming/scala/spark-notebook-0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive/../notebooks]
    [info] application - Notebook directory is: /home/yubrshen/programming/scala/notebooks
    [debug] application - Profiles file is : ../conf/profiles
    [debug] application - Clusters file is : ../conf/clusters
    [error] a.a.OneForOneStrategy - ../conf/profiles (No such file or directory)
    akka.actor.ActorInitializationException: exception during creation
    at akka.actor.ActorInitializationException$.apply(Actor.scala:166) ~[com.typesafe.akka.akka-actor_2.11-2.3.11.jar:na]
    at akka.actor.ActorCell.create(ActorCell.scala:596) ~[com.typesafe.akka.akka-actor_2.11-2.3.11.jar:na]
    at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456) ~[com.typesafe.akka.akka-actor_2.11-2.3.11.jar:na]
    at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) ~[com.typesafe.akka.akka-actor_2.11-2.3.11.jar:na]
    at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263) ~[com.typesafe.akka.akka-actor_2.11-2.3.11.jar:na]
    Caused by: java.io.FileNotFoundException: ../conf/profiles (No such file or directory)
    at java.io.FileInputStream.open0(Native Method) ~[na:1.8.0_131]
    at java.io.FileInputStream.open(FileInputStream.java:195) ~[na:1.8.0_131]
    at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[na:1.8.0_131]
    at scala.io.Source$.fromFile(Source.scala:91) ~[org.scala-lang.scala-library-2.11.8.jar:na]
    at scala.io.Source$.fromFile(Source.scala:76) ~[org.scala-lang.scala-library-2.11.8.jar:na]
    [debug] application - DASH → /
    [error] application -

    ! @749h2j09d - Internal server error, for (GET) [/profiles?_=1496900165975] ->

    play.api.Application$$anon$1: Execution exception[[AskTimeoutException: Recipient[Actor[akka://NotebookServer/user/$a#1733845533]] had already been terminated.]]
    at play.api.Application$class.handleError(Application.scala:296) ~[com.typesafe.play.play_2.11-2.3.10.jar:2.3.10]

    @yubrshen - Can you try restarting your notebook server, doing all the above steps ("Upload", etc.), and then going to the local link manually in the browser: http://localhost:9001/notebooks/JustEnoughScalaForSpark.snb#
    Give it a couple of minutes; it should refresh. It worked for me finally when I manually typed in the URL above
    Yu Shen
    I observed that there was not a single notebook shown at the bottom, not even those in the notebooks directory of the Spark Notebook distribution
    @mannit you meant restart the notebook server then use the explicit url to load the notebook after the manual "Upload"?
    kind of hacky, but that worked..
    Yu Shen
    Thanks! It worked by simply clicking the URL, even without restarting or re-doing "Upload".
    So for those running into the issue of not seeing the uploaded notebook, just use the explicit URL to load it.
    Great! :)
    Dean Wampler
    Sorry for the difficulties several of you have had. Spark Notebook is not the best experience. Most of the exceptions you noted, @yubrshen, don’t actually cause problems, but they shouldn’t occur anyway.
    @mannit, check out the Spark with Scala group, https://gitter.im/spark-scala/Lobby
    I am trying to install the Spark Notebook on my Mac and running into problems. My initial issue: when running 'bin/spark-notebook', I received a permission error. My solution was to change the permissions of the file with 'chmod 711'. Upon opening the localhost:9001 link, I received errors and have not found a fix. I am running Java 8 on macOS Sierra.
    Dean Wampler
    Does the terminal output show the URL with port 9001? If it’s 9000 (some versions of Spark Notebook), you have to use that port.
    If that’s not the issue, can you provide more information about the errors you’re seeing?
    User error, my apologies. As it turns out, my Java 8 was not recognized in my command terminal.
    Dean Wampler
    Glad you figured it out.
    I resorted to using homebrew to install it instead. Just got notebook loaded, thanks!
    Dean Wampler
    Good luck!
    @deanwampler I am working on implementing Lambda Architecture with Spark 2.0, Scala 2.11.8, Cassandra, and Kafka, and wanted to know if there are any recommended links to look into, especially for implementing the reconciliation layer between streaming and batch. Can you please suggest some?
    Dean Wampler
    I don’t have any recommendations, other than the fact that you can share code between batch and streaming. You might ask in the https://gitter.im/spark-scala/Lobby channel
    I am trying to run wordFileNameOnes.count but it's throwing the exception...
    Caused by: java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
    I am running this in spark-shell launched from Windows. Is something wrong with the regex split("""\W+""") given in the code when running from Windows?
    Dean Wampler
    That shouldn’t happen even on Windows. Can you provide a stack trace? Did you make any modifications that you think might have caused the issue?
    hi Dean.. there was an issue with the separator being used to get the file name from the full file path. It's resolved now and I'm getting the expected results.
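For anyone hitting the same PatternSyntaxException: a likely culprit, consistent with the resolution above, is passing the platform file separator straight to split. On Windows the separator is "\", which is an invalid regex on its own. A minimal sketch of the fix, with an illustrative path:

```scala
import java.util.regex.Pattern

// Illustrative Windows-style path; in the real code this would come
// from the loaded file names.
val windowsPath = """C:\data\shakespeare\tamingoftheshrew"""

// Pattern.quote escapes the separator so split matches it literally
// instead of treating "\" as the start of a regex escape.
val fileName = windowsPath.split(Pattern.quote("\\")).last
```

Using `Pattern.quote(java.io.File.separator)` keeps the same code working on both Windows and Unix-like systems.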
    Fernando Margueirat
    I see there have not been any new messages in almost a year, but I hope someone still checks this from time to time. I am running the Jupyter notebook as per https://github.com/deanwampler/JustEnoughScalaForSpark, but the cell output does not print the types of the variables, and it is sometimes hard to follow the examples without that information.
    i.e. instead of printing
    (Int, Int) = (1,2)
    it prints
    Does anyone know if this is some kind of setting I can change, or if it is due to me having a different version of the Docker container?
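One workaround while that question is open: if the kernel's echo omits types, you can print them yourself with a TypeTag. This is plain Scala 2 reflection, not a kernel setting, and `showType` is just an illustrative helper, not part of the tutorial:

```scala
import scala.reflect.runtime.universe._

// Illustrative helper: renders a value together with its static type,
// in the same spirit as the REPL's "(Int, Int) = (1,2)" echo.
def showType[T: TypeTag](label: String, value: T): String =
  s"$label: ${typeOf[T]} = $value"

val pair = (1, 2)
println(showType("pair", pair))
```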