Discuss Scala APIs and idioms for Spark Programming
Hi Dean. I'll be taking your tutorial Tuesday. I followed your instructions to prep for the class. All worked well until I clicked Upload on the Spark Notebook page to load JustEnoughScalaForSpark.snb. The GUI responded without error, but the notebook was not listed near the bottom of the page as your instructions show. The terminal session where I started the notebook just shows:
[INFO] SAVE → JustEnoughScalaForSpark.snb
[INFO] save at path JustEnoughScalaForSpark.snb
[INFO] Loading notebook at path JustEnoughScalaForSpark.snb
This is on a Mac running OS X Yosemite (10.10.5).
What did I miss?
Hi Dave. Yesterday I added a few troubleshooting details and images to that section of the README. If you cloned it before then, pull the update (or just look at it on the GitHub page).
Another thing to try is to copy the .snb file to the “notebooks” directory under the Spark Notebook root directory. I don’t think you’ll need to restart for it to become visible.
Copying the notebook did the trick (I downloaded everything just a few minutes ago). Thanks for the quick response! See you Tuesday.
Great! I just added that idea to the troubleshooting tips.
I installed spark-notebook following your README instructions, but the UI is not coming up in any of my browsers; Chrome shows ERR_CONNECTION_REFUSED. I removed the default Docker machine and re-installed, but still no luck. Here is the output in the Docker Quickstart Terminal:
$ docker run -p 9001:9001 andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive
time="2017-04-13T15:59:36+05:30" level=warning msg="Unable to use system certificate pool: crypto/x509: system root pool is not available on Windows"
Unable to find image 'andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive' locally
0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive: Pulling from andypetrella/spark-notebook
fdd5d7827f33: Pull complete
a3ed95caeb02: Pull complete
a93eb074af52: Pull complete
0c8bdcb3bc61: Pull complete
68ca236e9585: Pull complete
7de4152022ca: Pull complete
64467858f09b: Pull complete
Digest: sha256:aab33132c751dbc1f26de81ea29a44482764719cf95ce28ad8087731ebd5c2c8
Status: Downloaded newer image for andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive
Play server process ID is 1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/docker/lib/ch.qos.logback.logback-classic-1.1.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/docker/lib/org.slf4j.slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
[info] play - Application started (Prod)
[info] play - Listening for HTTP on /0.0.0.0:9001
Please provide your suggestions. Thanks in advance.
So you can't open localhost:9001?
File a bug with the GitHub project. Try to provide as much information as you can. Thanks.
Sure, thank you.
Hi, I will attend the session on Tuesday. I did the full installation last week and it went OK all the way through the println("Hello World!") check at the end. Today I checked again that everything is right, but:
1) I launch Docker: docker run -p 9001:9001 andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive (the download is already done).
2) I reopen it with the same docker run -p 9001:9001 andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive command.
3) http://localhost:9001/notebooks/JustEnoughScalaForSpark.snb?read_only=1 displays fine.
4) I get stuck with "Kernel starting, please wait" and cannot execute the sanity check.
Did I miss something?
I have a question: I am on Windows and I am using the Docker image. In that case, is a Java installation required on my Windows 10 machine?
Hi, Céline. You don’t need Java if you’re using the Docker image. I’m sorry it seems to have trouble running. I am flying to London this evening. I’ll investigate Monday and try to have a fix that evening.
Hi Dean, I attended your session at Strata in London but never managed to get the demos working. I want to fix that now. I have Docker installed and the image is up to date. Where it all goes wrong is when I try to connect to localhost:9001.
I get: localhost refused to connect. Search Google for localhost 9001 ERR_CONNECTION_REFUSED
Sorry for what is probably a basic question.
Hi, Martin. Thanks for attending and sorry for the hassle.
Did you pass the -p 9001:9001 argument?
docker run -p 9001:9001 andypetrella/spark-notebook:0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive
Hi, I have downloaded spark-notebook-0.7.0-scala-2.11.8-spark-2.1.0-hadoop-2.7.2-with-hive; however, when I am in this folder and run bin/spark-notebook, it asks for my permission. When I use sudo, it says: sudo: bin/spark-notebook: command not found
I have verified my java version as: java version "1.8.0_121"
Any advice on how to open the spark-notebook? Thanks.
I think I just solved it...thanks
I have uploaded "JustEnoughScalaForSpark.snb" from the GitHub location onto the notebook page. However, I don't see it added to the notebooks list. I chose "click here" to upload the notebook. Any help?
For people having issues seeing the notebook when using Docker, such as @EinserViech_twitter: you might need to refer to the IP of your docker-machine, e.g. 192.168.99.100:9001 rather than localhost:9001 (docker-machine ip will print it).
In addition, if you're using Docker, the files will be downloaded inside the Docker container instead of locally on your computer. You can see them by executing bash inside the running container. Do this in a separate terminal: docker exec -it <id of your container> /bin/bash. You'll get a command prompt; then ls and you should see the data/shakespeare directory.
Hi - I have a question regarding Scala and Spark; I'm not sure if this is the right forum, so if not, please direct me to the correct one. In the Scala class we just talked about DataFrames and how to use them with Scala. I would like to know how I can execute Spark SQL queries in parallel in a Spark Streaming application. Should I use Scala Futures to submit each DataFrame aggregation, and will those be executed concurrently?
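Concretely, here's a rough sketch of what I have in mind, using the default global ExecutionContext and a static DataFrame called events as a stand-in for what each micro-batch would give me (the object and app names are placeholders too); I don't know if this is the right approach:

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import org.apache.spark.sql.SparkSession

object ParallelAggregations {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("parallel-aggregations").master("local[*]").getOrCreate()
    import spark.implicits._

    // Stand-in for the DataFrame you would get from each streaming micro-batch.
    val events = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")

    // Spark actions like collect() block the calling thread, so wrapping each
    // independent aggregation in a Future should let the driver submit both
    // jobs concurrently instead of running them one after the other.
    val countsF = Future { events.groupBy("key").count().collect() }
    val sumsF   = Future { events.groupBy("key").sum("value").collect() }

    // Wait for both jobs to finish, then print the combined results.
    val results = Await.result(Future.sequence(Seq(countsF, sumsF)), 2.minutes)
    results.foreach(_.foreach(println))

    spark.stop()
  }
}

Would the two collect() calls really run as separate concurrent Spark jobs here, or do I also need to configure the scheduler (e.g. FAIR scheduling) so they share the cluster?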
I'm having the same problem as @mannit. I'm using Ubuntu 16.04, with the Java SDK:
$ java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)