    David M.
    @david1155
    Toree version 0.4.0
    David M.
    @david1155
    It seems that I have to downgrade Python 3.8->3.7
    David M.
    @david1155
    [ToreeInstall] ERROR | Unknown interpreter PySpark. Skipping installation of PySpark interpreter
    Zakk
    @a1mzone

    Hi @s3uzz

    Yes you could try an older python, I am still using 3.6 as mentioned previously.

    Below is a small example bash script I use to install

    #!/bin/sh
    export VERSION=0.1.0
    export SPARK_HOME=/path/to/sparkHome
    jars="file:///path/to/jar"
    jars="$jars,file:///path/to/jar"
    
    jupyter-toree install \
        --replace \
        --debug \
        --user \
        --kernel_name "project $VERSION" \
        --spark_home=${SPARK_HOME} \
        --spark_opts="--master yarn --jars $jars"
    David M.
    @david1155
Zakk, thank you. Maybe the PySpark interpreter is not available in version 0.4.0? I use
    jupyter toree install --spark_home=${SPARK_HOME} --interpreters=Scala,PySpark,SQL --python_exec=/opt/conda/bin/python3.8
    and get error
    [ToreeInstall] ERROR | Unknown interpreter PySpark. Skipping installation of PySpark interpreter
    amitzo
    @amitzo
    Hi, I am running Toree 0.5.0 RC4 with Spark 3.1.2. Everything is working fine except there is a problem with errors not displaying in the cell output, while they do show up in the file if I do "Download as Notebook (.ipynb)". For example, if I type "blah" in a cell and run it, I see a blank response in the notebook, but this value in the downloaded file:
    {
        "cell_type": "code",
        "execution_count": 2,
        "id": "68eb215f",
        "metadata": {},
        "outputs": [
            {
                "ename": "Compile Error",
                "evalue": "<console>:26: error: not found: value blah\n blah\n ^\n",
                "output_type": "error",
                "traceback": []
            }
        ],
        "source": [
            "blah"
        ]
    }
    Is this a bug, or am I missing some configuration? How do I get these types of errors to display in the cell output?
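    (As an aside: the error output pasted above is standard nbformat JSON, so the traceback can be recovered from the saved file even when the front-end fails to render it. A minimal sketch using only the standard library; the cell dict is copied from the message above, and the extraction logic is an illustration, not part of Toree:)

    ```python
    import json

    # A saved notebook cell as it appears in the downloaded .ipynb
    # (structure copied from the message above).
    cell = json.loads("""
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "68eb215f",
      "metadata": {},
      "outputs": [
        {
          "ename": "Compile Error",
          "evalue": "<console>:26: error: not found: value blah\\n blah\\n ^\\n",
          "output_type": "error",
          "traceback": []
        }
      ],
      "source": ["blah"]
    }
    """)

    # Collect every error output that the front-end failed to display.
    errors = [o for o in cell["outputs"] if o.get("output_type") == "error"]
    for err in errors:
        print(f'{err["ename"]}: {err["evalue"]}')
    ```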
    Rahul Goyal
    @rahul26goyal
    Hi team
    Regarding the issue that @amitzo has brought up: we are seeing a similar issue, and I think it is preventing end users from knowing what exactly happened in the backend.
    Is there a ticket open already on this? What is the plan in general to improve and address user-experience gaps from the community?
    I will be more than happy to help out if someone can guide me on this.
    Thanks
    Kevin Bates
    @kevin-bates

    Here’s a link to a related Toree JIRA: https://issues.apache.org/jira/browse/TOREE-522?jql=project%20%3D%20TOREE%20AND%20component%20%3D%20Kernel
    In there @lresende points out that this appears to be a function of the front-end - whether it is Notebook or Lab. I see the same behavior. In fact, when I run using the classic Notebook front-end, I don’t see the error when it’s syntax-related (e.g., “blah”), but I do get some error result for something like a divide by zero (e.g., “1/0”). However, what is interesting is that if I save that notebook and then open it in Lab, I see the appropriate messages in both cases. So, as @amitzo is pointing out, the error details are persisted in the notebook, but something about the display is not working.
    I’m curious if you see any error output resulting from divide-by-zero issues?

    Here are two screen shots, the first using Notebook and the second using Lab where, for the second, I merely opened the notebook file produced from the first...

    Screen Shot 2022-05-03 at 11.36.19 AM.png
    Screen Shot 2022-05-03 at 11.38.28 AM.png
    Omkar Kalange
    @komkar123
    Kernel dies and does not restart
    kernel.json:
    {
        "argv": [
            "/home/ec2-user/.local/share/jupyter/kernels/apache_toree_scala/bin/run.sh",
            "--profile",
            "{connection_file}"
        ],
        "env": {
            "DEFAULT_INTERPRETER": "Scala",
            "TOREE_SPARK_OPTS": "--master=spark://ip-172-31-1-122.ec2.internal:7077",
            "TOREE_OPTS": "",
            "SPARK_HOME": "/home/ec2-user/spark"
        },
        "display_name": "Apache Toree - Scala",
        "language": "scala",
        "interrupt_mode": "signal",
        "metadata": {}
    }
    Luciano Resende
    @lresende
    Well, information from the actual logs or anything related to failures would be much more helpful
    Omkar Kalange
    @komkar123
    image.png
    AsyncIOLoopKernelRestarter: restart failed
    I tried uninstalling Jupyter and Toree and installing them back, but I still get the same error
    Kevin Bates
    @kevin-bates
    Hi @komkar123 - thanks for the additional information. The issue is that the Toree kernel instance can’t get fully started. As a result, the kernel restarter detects its death and attempts to start the kernel again (4 more times). So the Jupyter side of things is working as expected and you should focus on resolving the issue in the stack trace. Is there any additional information before or after what is pictured? I’m assuming this occurs immediately at startup, before any cell is executed, but please let us know if that’s not the case.
    Omkar Kalange
    @komkar123
    Yes, this happens when I select Apache Toree as kernel in Jupyter, before executing any piece of code.
    Omkar Kalange
    @komkar123

    image.png

    This is what happens when I start jupyter and select Apache Toree as kernel

    Omkar Kalange
    @komkar123

    jupyter toree install --user --spark_home=$HOME/spark --spark_opts="--master=spark://$MASTER_URL:7077"

    This is in the stack used to install Toree

    Omkar Kalange
    @komkar123
    I searched for the error java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V and it looks like there is a Scala version mismatch
    Luciano Resende
    @lresende
    What version of Scala are you using? For latest Toree you should be using Scala 2.12
    https://github.com/apache/incubator-toree/blob/70cc22012250e09b7061211e8716fffe0ff35d76/Makefile#L34
    Omkar Kalange
    @komkar123
    image.png
    Omkar Kalange
    @komkar123
    Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.
    This is the version of Scala installed on my machine. However, when I start Spark 2.4.6 using spark-shell, it uses Scala 2.11.12:
    Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
    Luciano Resende
    @lresende
    You should install Toree 0.4 for Spark 2.4.6 and Scala 2.11
    Toree 0.5, by default, is built with Spark 3 + Scala 2.12
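    (The mismatch Luciano describes follows Scala 2.x's binary-compatibility rule: artifacts are only compatible when the major.minor pair matches; the patch level is interchangeable. A tiny sketch of the check worth doing before installing, using the version strings from this thread; the helper names are illustrative, not a Toree API:)

    ```python
    def scala_binary_version(version: str) -> str:
        """Return the binary-compatibility prefix of a Scala 2.x version,
        e.g. '2.11.12' -> '2.11'. Patch releases are interchangeable;
        different minor versions (2.11 vs 2.12 vs 2.13) are not."""
        major, minor, *_ = version.split(".")
        return f"{major}.{minor}"

    def compatible(a: str, b: str) -> bool:
        """True when two Scala versions can load each other's artifacts."""
        return scala_binary_version(a) == scala_binary_version(b)

    # Versions from this thread: Spark 2.4.6's shell vs. Toree 0.5's build.
    print(compatible("2.11.12", "2.12.15"))  # False: hence NoSuchMethodError
    print(compatible("2.12.2", "2.12.15"))   # True: patch level is irrelevant
    ```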
    Omkar Kalange
    @komkar123
    Oh, can you please tell me how I can run the Toree install command with 0.4?
    Luciano Resende
    @lresende
    pip install toree==0.4.0
    Omkar Kalange
    @komkar123
    Thank you so much, I really appreciate your timely help.
    It worked. The stack was created some time back last year, so running it now installed the latest Toree
    Luciano Resende
    @lresende
    great to hear that
    Andres Rodriguez
    @acrodriguez_twitter
    Hello. I am trying to pass some data from a Scala cell to a JS cell so that I can perform a (graph) visualization via D3. I cannot seem to find the right documentation. The DataBricks website says to use displayHTML, but that does not seem to be available. Is it possible? Thanks much 🙏
    dhia-gharsallaoui
    @dhia-gharsallaoui
    Hi there, I want to add Apache Toree to my JupyterHub. My problem is that all my jars are compiled with Scala 2.13.5 + Spark 3.2.1. When I add the Toree kernel it doesn't start, and I think this is because of the version, since Toree is built with Scala 2.12. I also tried to rebuild Toree with Scala 2.13 but got many errors. Any suggestions? Thank you
    jbguerraz
    @jbguerraz

    Hi there, I want to add Apache Toree to my JupyterHub. My problem is that all my jars are compiled with Scala 2.13.5 + Spark 3.2.1. When I add the Toree kernel it doesn't start, and I think this is because of the version, since Toree is built with Scala 2.12. I also tried to rebuild Toree with Scala 2.13 but got many errors. Any suggestions? Thank you

    Is there any open issue related to Scala 2.13 adoption? I didn't find any on JIRA, but I'm not so used to it, so maybe I just missed it

    Nhat Nguyen
    @ntnhaatj
    Hi guys, I ran into trouble when trying to export an environment variable from the Toree Jupyter kernel. Could anyone enlighten me on how to do that? Thanks a lot
    %export env=val
    Bostonian
    @athensofamerica
    When I install Toree with "--interpreters=Scala,PySpark,SparkR,SQL", I only see Scala and SQL. Are the PySpark and SparkR interpreters no longer supported?
    RaSi96
    @RaSi96
    Screenshot_20221104_140426.png
    Greetings all, good day. I've just done a fresh installation of Arch Linux 6.0.6-arch1-1, Jupyter Lab, Spark 3.3.1 (spark-3.3.1-bin-hadoop3-scala2.13) and Toree with everything going well. Upon launching Jupyter and opening a new notebook with Toree, my kernel keeps disconnecting with the above error.
    (I wasn't expecting my photo to upload before my text did, apologies if I confused anybody).
    Some very shallow Google-Fu showed me that it might be a Scala version mismatch? I'm unsure how to go about launching Toree now; my $SPARK_HOME and $JAVA_HOME work fine because I'm able to navigate to $SPARK_HOME/bin and launch a spark-shell.
    RaSi96
    @RaSi96

    Hi there, I want to add Apache Toree to my JupyterHub. My problem is that all my jars are compiled with Scala 2.13.5 + Spark 3.2.1. When I add the Toree kernel it doesn't start, and I think this is because of the version, since Toree is built with Scala 2.12. I also tried to rebuild Toree with Scala 2.13 but got many errors. Any suggestions? Thank you

    Well that's the end of that then, I completely scrolled past this exact same issue. Apologies for wasting everyone's time :)

    RaSi96
    @RaSi96
    Sorry to revive my problem again, but after matching Scala versions and launching, I'm met with a java.lang.UnsupportedOperationException: The Security Manager is deprecated and will be removed in a future release; consequently my kernel never manages to connect and is left in a perpetual "connecting" status
    Screenshot_20221104_164535.png
    This is with Arch Linux 6.0.6, Spark 3.3.1 with Hadoop 3 and Scala 2.12.15 (direct from Apache's Spark download webpage), and Jupyter Lab 3.5.0
    Is there any way I could look into this or work around it?
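    (For context on that exception: JEP 411 deprecated the Java Security Manager, and as of JDK 18 the java.security.manager system property defaults to "disallow", so any code that tries to install a Security Manager throws UnsupportedOperationException instead of merely warning. A workaround sketch, assuming a newer JDK is on the PATH; whether passing the property via --driver-java-options reaches Toree's JVM early enough is an assumption about how Toree launches the driver, so switching to JDK 17 or 11 is the safer route:)

    ```shell
    # First check which JDK is on the PATH; JDK 17 or 11 avoids the issue entirely.
    java -version

    # If stuck on JDK 18+, explicitly re-allow installing a Security Manager
    # (JEP 411 changed the default of java.security.manager to "disallow").
    # Routing the property through --spark_opts is an assumption, not a
    # documented Toree recipe:
    jupyter toree install --user \
        --spark_home="${SPARK_HOME}" \
        --spark_opts='--master local[*] --driver-java-options=-Djava.security.manager=allow'
    ```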