coder-forever-2020
@coder-forever-2020
Is there any suggestion to make the remote kernel interactive?
Thanks!
Kevin Bates
@kevin-bates
Hi @coder-forever-2020. This implies there’s a disconnect between the Notebook/Lab instance and EG, or between EG and the remote kernel. If the kernel names presented in the list appear to be Kubernetes-related, then your issue is probably the latter. Other factors can come into play as well: the initial seeding of the kernel images can take a while, and if the KernelImagePuller daemonset is not working correctly, that can lead to timeout issues when attempting initial kernel startups on a given node. All in all, it can take a little while to get all the communications working correctly.
Please take a look at your EG pod logs and see if anything can be gleaned from them as to what might be going on. Should you still be stuck, please open an issue in the repo and provide the output of the EG pod logs, as well as a screenshot or two of the notebook that includes the kernel name, its status, and the executed cell. Thanks.
coder-forever-2020
@coder-forever-2020
Thanks @kevin-bates will do.
coder-forever-2020
@coder-forever-2020
@kevin-bates and all, it turns out tornado 6.x is not compatible with Jupyter Notebook 6. I got this warning: "RuntimeWarning: coroutine 'WebSocketHandler.get' was never awaited"
After I downgraded tornado to 5.1.1, the whole integration with the remote kernel works like a charm. Thanks, everyone, for the effort put into this amazing project.
Kevin Bates
@kevin-bates
Hmm - I haven't seen that recently. Please ensure you're running EG 2.2 and Notebook 6.1+. Is that being produced on the client or EG side of things? Please provide a complete traceback of the exception and surrounding log entries.
coder-forever-2020
@coder-forever-2020
Yeah, the tornado version issue occurs under EG 2.2; below are the detailed Jupyter component versions.
jupyter core : 4.6.3
jupyter-notebook : 6.1.3
qtconsole : not installed
ipython : 7.17.0
ipykernel : 5.3.4
jupyter client : 6.1.6
jupyter lab : 2.2.5
nbconvert : 5.6.1
ipywidgets : not installed
nbformat : 5.0.7
traitlets : 4.3.3
The warning is produced in JupyterHub's notebook pod.
Kevin Bates
@kevin-bates
Hmm - I guess I was under the impression the issue was occurring where EG is running, but your last comment implies it's where the notebook is running - which makes sense. Does that fit your scenario?
Could you please provide the traceback you’re seeing? And is the image available for me to pull and look at? Thanks.
coder-forever-2020
@coder-forever-2020
@kevin-bates The Jupyter Notebook Docker image that got the warning is jupyter/minimal-notebook:latest. The traceback only has the warning "RuntimeWarning: coroutine 'WebSocketHandler.get' was never awaited".
May I also check: in EG v2.2, can I configure the enterprise-gateway pod to pass arbitrary environment variables to the remote kernel? The use case is dynamically provisioning the kernel's user account for access control.
Kevin Bates
@kevin-bates

@coder-forever-2020 - where are you using jupyter/minimal-notebook:latest? Is this for your client-side notebook server? Please include the full traceback rather than just the "was never awaited" message.

Regarding additional envs: besides KERNEL_-prefixed envs, which flow unconditionally, you can add a list of env names to EG_ENV_WHITELIST, or use the CLI option --EnterpriseGatewayApp.env_whitelist:

--EnterpriseGatewayApp.env_whitelist=<List>
    Default: []
    Environment variables allowed to be set when a client requests a new kernel.
    (EG_ENV_WHITELIST env var)
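For concreteness, here is a minimal sketch of the config-file form of that same setting (the file name and env names are illustrative, not taken from this conversation):

# jupyter_enterprise_gateway_config.py -- illustrative sketch
# Allow the named (non-KERNEL_) envs to flow from client kernel start
# requests into remote kernels; KERNEL_-prefixed envs always flow.
c = get_config()
c.EnterpriseGatewayApp.env_whitelist = ['MY_LDAP_USER', 'SPARK_OPTS']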
coder-forever-2020
@coder-forever-2020
Thanks a lot
coder-forever-2020
@coder-forever-2020
@kevin-bates If we would like to allow users to specify kernel parameters, e.g. the number of GPUs, when requesting a remote kernel, do you have any suggestions? Thanks.
Kevin Bates
@kevin-bates
This depends on whether or not you own the UI that would allow the user to specify such parameters. However, the answer (at this time) is still the same: short of adding proper support for parameterized kernel launch, the only way this can be accomplished is to use KERNEL_ env variables and then modify the kernelspec files to use those values accordingly.
If you own the UI, then you can set the KERNEL_ values (e.g., KERNEL_NUM_GPUS=2) in the env: stanza of the kernel start request body. If you don't have control over the UI, then those envs would need to be present in the notebook process making the kernel start request, since the gateway logic automatically flows any KERNEL_ envs.
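To make the request-body form concrete, here is a sketch of a direct REST call against EG (the gateway URL and kernelspec name are placeholders):

import requests

# Start a kernel through EG, passing a KERNEL_-prefixed value in the
# env: stanza of the request body; the kernelspec would reference it.
resp = requests.post(
    "http://eg-host:8888/api/kernels",
    json={
        "name": "spark_python_kubernetes",   # placeholder kernelspec name
        "env": {"KERNEL_NUM_GPUS": "2"},     # flowed to the remote kernel
    },
)
print(resp.json())  # model of the newly started kernel (id, state, ...)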
Rick Lamers
@ricklamers

I noticed the latest version of JupyterLab (3.0.0-rc.8) doesn't have the check_origin() that was added to the gateway websocket handler (https://github.com/jupyter/jupyter_server/blob/master/jupyter_server/gateway/handlers.py#L35).

In general, does Enterprise Gateway already support JupyterLab 3.0.0? Because even with the above patched, I couldn't get kernel messages flowing from JupyterLab (3.0.0-rc.8) to EG (tested with 2.2). No errors, just stuck (kernel had the lightning icon) while executing a cell.
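For context, the check_origin change Rick references amounts to overriding tornado's same-origin check on the websocket handler that proxies kernel traffic to the gateway - roughly this shape (a simplified sketch, not the actual jupyter_server code):

from tornado.websocket import WebSocketHandler

class GatewayWebSocketHandler(WebSocketHandler):
    """Simplified stand-in for jupyter_server's gateway websocket handler."""

    def check_origin(self, origin=None):
        # Without an override, tornado's same-origin default can reject the
        # browser's websocket upgrade, leaving kernel messages silently stuck.
        return True  # the real handler defers to the app's origin settings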

Luciano Resende
@lresende
@ricklamers we will take a look at it; feel free to create an issue on EG to track it
Rick Lamers
@ricklamers
I’ll first try to create a minimal reproducible setup on 2.3. I’ll also submit a PR to JLab master for the handler.
If the problem persists in the minimal reproducible setting, I’ll create an issue for tracking! Thanks for replying.
Rick Lamers
@ricklamers

Seems like the same issue happens on 2.3 + 3.0.0-rc9. Created a PR with jupyter/jupyter_server for handler.py and created an issue for tracking interop on Enterprise Gateway.

Could be related to awaiting connect in the handler (see issue #903).

Kevin Bates
@kevin-bates

Thanks Rick. Yes, the check_origin change is good. However, I'm still unable to complete the kernel's startup from Lab 3 to EG. I'm not seeing the kernel-info-request/reply sequence that typically occurs during the establishment of the websocket - nor the websocket establishment itself.

There shouldn't be anything required on the EG side of things, but I need to ensure there aren't any missing PRs in notebook that belong in jupyter-server - like the check_origin change.

Ivar Stangeby
@qTipTip

Hey!

I love the idea of Enterprise Gateway and would love to take it for a spin, but I am having some installation issues. Is there any way of testing this locally in kind (Kubernetes in Docker)? I am running into issues with the kernel-image-puller not being able to connect: error 111.

Traceback (most recent call last):
  File "./kernel_image_puller.py", line 25, in <module>
    docker_client = DockerClient.from_env()
  File "/usr/local/lib/python3.8/site-packages/docker/client.py", line 84, in from_env
    return cls(
  File "/usr/local/lib/python3.8/site-packages/docker/client.py", line 40, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 188, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 212, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', ConnectionRefusedError(111, 'Connection refused'))
Kevin Bates
@kevin-bates
Hi @qTipTip. I'm not sure what is going on with the KernelImagePuller; it can be a little finicky. If you can provide some more details, go ahead and open an issue in the repo. However, its functionality is more for larger clusters, and you should still be able to complete your "spin" without it. KIP is helpful because the image download tends to take longer than the kernel startup timeout - the idea is that it will have pulled the images prior to the first request. You could either preload your kernel images or suffer a timeout and retry the start after the initial reference has completed the download.
The image names are baked into the kernelspec files located in the EG image's /usr/local/share/jupyter/kernels directories. I'd just pick a couple of kernelspecs you're interested in and pull those images.
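If you'd rather preload programmatically, here is a small sketch using the same Docker SDK that KIP itself uses (the image names are examples - take the real ones from the kernelspecs you plan to run):

from docker import DockerClient

# Pre-pull kernel images so the first kernel start doesn't exceed the
# launch timeout while the image downloads.
client = DockerClient.from_env()
for image in ("elyra/kernel-py:2.6.0", "elyra/kernel-spark-py:2.6.0"):
    repo, _, tag = image.partition(":")
    client.images.pull(repo, tag=tag or None)
    print(f"pulled {image}")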
dummys
@dummys
hey guys
Kevin Bates
@kevin-bates
Hi - I just replied to you on DM.
Ceyda Cinarel
@cceyda
which version of JupyterLab do I need to work with the master branch of Enterprise Gateway?
Kevin Bates
@kevin-bates
Hi Ceyda, EG master is based on jupyter_server, so you’ll need lab 3.x.
gyang321
@gyang321
Hello, we are working to set up Spark 3 in EG. We saw an open issue for Spark 3. Is there an update on it? Thanks
Rahul Goyal
@rahul26goyal
hi @kevin-bates : does "spark_python_kubernetes_kernel" support magic commands like "%config" to specify dynamic spark configs?
Kevin Bates
@kevin-bates
EG essentially creates the Spark context via the kernel launch sequence (to basically emulate how Apache Toree works) so, again, I suspect the context will already be created. The idea is that the variances used to create the context are conveyed via "parameters" which, today, merely consist of various KERNEL_ env values - a bit clunky.
Rahul Goyal
@rahul26goyal

right..
Thinking out loud here:
I think these types of magics need to be supported by the UI (notebook / JupyterLab).. currently these clients don't expose any starter/bootstrap cell space to take input from users.. not having that is leading to extra work on kernels to support such common requirements..

In a shared space, it's not possible to set a single KERNEL_* value for all the users.. each user has their own set of requirements in terms of Spark driver and executor memory, etc.

Kevin Bates
@kevin-bates
Correct. This is where we should introduce Parameterized Kernel Launch (although the referenced JEP should probably be rebooted since I can never find time to move it forward). The idea here is that most parameters are really environmental in nature (i.e., affect the kernel’s runtime environment) and only the kernel provisioner is “aware” of those parameters. Yes, there are kernel-specific parameters as well, but, for the most part, parameters really apply to the provisioned environment.
Rahul Goyal
@rahul26goyal
I will read through the proposal today.
Rahul Goyal
@rahul26goyal
does JEG support running Sparkmagic kernels out of the box?
Kevin Bates
@kevin-bates
I’m not familiar with what you mean by “sparkmagic kernels”.
Regarding the ‘out of the box’ portion, the kernelspecs that EG (currently) provides should be considered examples, with the intent being that they be modified to suit the needs of the configuration in which they’re deployed. EG’s process proxies (like kernel provisioners) are intended to be kernel-agnostic (although their respective launchers tend to adhere to the kernel’s language) and are unaware of what kind of kernel they are launching. As with any Jupyter kernelspec, they merely invoke the kernelspec’s argv stanza. What they do, however, is discover the kernel’s destination and manage its lifecycle using the proxy/provisioner - which typically interacts with the resource-managed cluster. Therefore, there should be nothing needed from EG or its process proxies to support a "sparkmagic kernel".
Rahul Goyal
@rahul26goyal
makes sense kevin.. "sparkmagic" is not really a kernel.. my bad for calling it a kernel.
I am actually not very familiar with sparkmagic.. just that it came up in a few discussions and I have started to read about it a little...
from what I understand, sparkmagic supports a bunch of kernels, as listed here: https://github.com/jupyter-incubator/sparkmagic/tree/master/sparkmagic/sparkmagic/kernels and there is also a dependency on Livy to execute the actual code..
so, I am not sure if I'm making any sense here: but is "sparkmagic" a different type of kernel manager/proxy which acts like a bridge between the UI and Livy to execute the Spark code?
Kevin Bates
@kevin-bates
After poking around the linked repo (thanks), the sparkmagic kernels derive from IPythonKernel so, in that sense, they are just kernels. However, the EG Spark kernels are launched using spark-submit, and doing something similar for the sparkmagic kernels would likely interfere with the Livy layer.
To use these from EG, I suspect you'd need to convey to the kernel launcher script that it should embed a sparkmagic kernel instead of ipykernel and then interact with Spark via Livy, but I can't comment on that and am not sure these thoughts even make sense. Sorry
Rahul Goyal
@rahul26goyal
@kevin-bates : I was able to test a few things with JEG:
I installed the sparkmagic kernelspecs in their current state into the JEG container and was able to spin up a kernel from the notebook UI.. but the kernel was running locally, as expected, and was able to connect to the Livy server, which was running remotely... ideally, though, we would want sparkmagic to also run outside of the JEG container?
Rahul Goyal
@rahul26goyal

It's not very straightforward as to what the right way is to integrate sparkmagic (SM) based kernels on JEG:

  1. Should the SM kernel run locally, with Livy as a long-running server in a separate container? If so, will the memory/CPU footprint of SM interfere with JEG? This also introduces the overhead of managing Livy as a new service for the user..

  2. Should we bundle SM and Livy together as a single container and, each time an SM kernel is launched, launch this container? With this approach we would need to write 2 new scripts: 1. the JEG-side kernel launcher, and 2. the script that would start Livy and sparkmagic and communicate the connection_file information back to JEG.

  3. Other alternatives are welcome :)

Kevin Bates
@kevin-bates

I'm not very familiar with Livy but, looking at its documentation, it essentially manages multiple Spark contexts (probably Spark drivers, depending on how it's configured). As a result, you'd probably be okay running the SM kernel locally - as you have done - as they should be fairly lightweight. The downside is that the EG pod would present a single point of failure, and I know you've expressed interest in some of the EG "opportunities" for HA/DR - which only apply to remote kernels.

If you wanted to place the SM kernels in their own pods, you'd need to extend the Python kernel launcher to know it should instantiate an SM kernel (class) rather than IPythonKernel (which it embeds right now) - see the sketch below. I don't know if you could do something similar with this extended launcher, but the class name (at a minimum) would need to be "plumbed" as a parameter. It may be easier to create another launcher - which could perhaps avoid having to plumb the parameter.

What is your reason for needing to use Livy? Is it acting as a bridge to a Hadoop/YARN configuration from Kubernetes? I'm just curious why you're not using Spark, and the Spark-K8s kernel examples, directly in K8s.

cc: @lresende since he knows more about Livy than I do.
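To make the "plumbed as a parameter" idea concrete, here is a rough sketch of an extended launcher entry point (this is not EG's actual launcher code; the env name and the sparkmagic class path are assumptions to verify):

import os

from ipykernel.kernelapp import IPKernelApp

# Hypothetical: choose the embedded kernel class from a parameter the
# launcher receives (here an env var), defaulting to the stock IPython
# kernel. A sparkmagic class path to try might look like:
#   sparkmagic.kernels.pysparkkernel.pysparkkernel.PySparkKernel
kernel_class = os.environ.get(
    "KERNEL_CLASS_NAME", "ipykernel.ipkernel.IPythonKernel"
)

if __name__ == "__main__":
    # IPKernelApp.kernel_class is a Type trait, so an import string works.
    IPKernelApp.launch_instance(kernel_class=kernel_class)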

Luciano Resende
@lresende
@kevin-bates in EG, if using the CRD process proxy, is there a way to disable EG managing the namespace? As in, not creating one namespace per kernel, since the Spark operator knows how to handle that?
Kevin Bates
@kevin-bates
I haven’t used it, but here’s the PR: jupyter/enterprise_gateway#991.
The doc update indicates a well-known namespace needs to be used, but that’s essentially the "BYO namespace" approach. I believe EG expects a kernel-associated namespace to exist prior to the pod's deployment, so that doesn’t sound compatible with what you’re talking about. Does the Spark operator tolerate existing namespaces? If so, then setting KERNEL_NAMESPACE to a known value might be what you want.
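If the operator does tolerate existing namespaces, the start request could pin one like this sketch (the URL and namespace name are hypothetical):

import requests

# Pin the kernel to a pre-existing, well-known namespace via KERNEL_NAMESPACE.
requests.post(
    "http://eg-host:8888/api/kernels",
    json={
        "name": "spark_python_kubernetes",
        "env": {"KERNEL_NAMESPACE": "spark-operator-jobs"},  # must already exist
    },
)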
Luciano Resende
@lresende
Yes, the namespace is there and available, but it’s not constant, and I was trying to avoid managing that piece in EG… something like BYO but without having to pass the namespace on the CRD… I will play with it a bit more and report back here.
Kevin Bates
@kevin-bates
ok - sounds good. BYO is what you’d have to use. The namespace used on the CRD, I believe, needs to be known to EG - so that might be a problem. (sorry, just noticed this sitting in an “un-sent” state, fwiw.)
Luciano Resende
@lresende
:) for now I am ok, but will reach out when I get back to this
Kevin Bates
@kevin-bates
[ANN] Jupyter Enterprise Gateway 2.6.0 has been released: https://github.com/jupyter/enterprise_gateway/releases/tag/v2.6.0
Luciano Resende
@lresende
Congrats !!!
Kevin Bates
@kevin-bates
Thanks to all (including you @lresende :smile:)!
s3uz
@s3uzz

Hi!
Kindly asking for help with my problem. I'm using the setup JupyterHub <-> JupyterLab (Docker) <-> Jupyter Enterprise Gateway (Docker) <-> YARN Hadoop cluster mode:
notebook=6.2.0
jupyterhub=1.3.0
jupyterlab=3.0.12
The remote kernel takes 5 minutes to start, so I set the following envs:
KERNEL_LAUNCH_TIMEOUT=900
EG_KERNEL_LAUNCH_TIMEOUT=900
KG_REQUEST_TIMEOUT=900
EG_CULL_IDLE_TIMEOUT=43200
EG_CULL_INTERVAL=900
EG_CULL_CONNECTED=False
KG_REQUEST_TIMEOUT=900
KG_CONNECT_TIMEOUT=900
When I start Docker with the user notebook as "docker run ... jupyter lab", everything is perfect: the remote kernel starts in 3-5 minutes, respecting the envs, and writing to the logs:
Error attempting to connect to Gateway server url.......... until the kernel finally gets connected.
But when I use JupyterHub to spawn user notebooks with DockerSpawner, the notebook always times out with a 504 gateway timeout after 60 seconds.
I have tried to set this config in JupyterHub:
c.JupyterHub.service_check_interval = 180
c.Spawner.args = ['--GatewayClient.connect_timeout=900','--GatewayClient.request_timeout=900','--AsyncMappingKernelManager.kernel_info_timeout=900','--AsyncMultiKernelManager.use_pending_kernels=True','--AsyncMultiKernelManager.use_pending_kernels=43200']
c.JupyterHub.tornado_settings = {
'slow_spawn_timeout': 0,
'connect_timeout': 0,
'request_timeout': 0,
'raise_error': False,
}
c.Spawner.start_timeout = 900
c.JupyterHub.service_check_interval = 900
c.Spawner.environment = {'KERNEL_LAUNCH_TIMEOUT': 900,'EG_KERNEL_LAUNCH_TIMEOUT': 900, 'KG_REQUEST_TIMEOUT': 900, 'EG_CULL_IDLE_TIMEOUT': 900, 'EG_CULL_INTERVAL': 900, 'EG_CULL_CONNECTED': False, 'KG_CONNECT_TIMEOUT': 900, 'KG_REQUEST_TIMEOUT': 900}

But nothing helps. When the notebook is started with "jupyter labhub", the kernel does not respect the timeout for connecting to the remote gateway.
When I start the notebook as "jupyter lab", the kernel does not time out and starts in several minutes.

Kindly asking for help increasing the gateway timeout when spawning notebooks with JupyterHub. Thank you very much in advance.
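Two details worth noting in the config above: '--AsyncMultiKernelManager.use_pending_kernels' is passed twice (the second value, 43200, looks like it was intended for another option), and the Spawner.environment values are numbers and booleans where env var values are conventionally strings. A consolidated sketch of the presumed intent (an assumption, not a verified fix):

# jupyterhub_config.py -- consolidated sketch, not a verified fix
c.Spawner.start_timeout = 900
c.Spawner.args = [
    '--GatewayClient.connect_timeout=900',
    '--GatewayClient.request_timeout=900',
    '--AsyncMappingKernelManager.kernel_info_timeout=900',
    '--AsyncMultiKernelManager.use_pending_kernels=True',
]
c.Spawner.environment = {
    'KERNEL_LAUNCH_TIMEOUT': '900',        # env var values as strings
    'EG_KERNEL_LAUNCH_TIMEOUT': '900',
    'KG_REQUEST_TIMEOUT': '900',
    'KG_CONNECT_TIMEOUT': '900',
    'EG_CULL_IDLE_TIMEOUT': '43200',
    'EG_CULL_INTERVAL': '900',
    'EG_CULL_CONNECTED': 'False',
}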
