Access the Dask cluster using Jupyter Extensions

Hi Team,

I have a few questions related to integrating Jupyter Extensions with the KubeCluster (Operator).

  1. Is there a way to enable logging for the Jupyter extension, so I can troubleshoot the issues its backend and frontend are having?

  2. When I try to create a new cluster from the UI with a KubeCluster spec, it gives me a timeout error:

Cluster failed to start: Timed out during handshake while connecting to tcp://dask-jovyan-a77f1b85-4.namespace:8786 after 30 s
I updated the config files (labextension.yaml and kubernetes.yaml) under ~/.config/dask/.
How does the classic KubeCluster definition differ from the Operator one when specified in kubernetes.yaml?

  3. How does the auto-start feature inject the Dask client into the notebooks?

  4. How can I connect the extension to a cluster that was created from code, let's say with the Kubernetes Operator? I tried providing http://{cluster_name}-scheduler.{namespace}.svc.cluster.local:8787/ but it does not seem to connect and, oddly enough, in my browser's developer tools I see a response as follows:
    {“url”: "http:/{cluster_name}-scheduler.{namespace}.svc.cluster.local:8787/

It seems that the protocol was not picked up correctly.
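To make the symptom concrete, here is a minimal sketch using only the Python standard library that compares the URL I entered with the one in the response. The cluster name `demo` and namespace `dask` are hypothetical placeholders for `{cluster_name}` and `{namespace}`, and this only shows how a generic URL parser reads the two strings; I do not know whether the extension parses them the same way.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL and report the parts a dashboard link needs."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,  # None when there is no "//" after the scheme
        "port": parts.port,
        "path": parts.path,
    }


# Hypothetical names standing in for {cluster_name} and {namespace}.
good = describe_url("http://demo-scheduler.dask.svc.cluster.local:8787/")
bad = describe_url("http:/demo-scheduler.dask.svc.cluster.local:8787/")

print(good)
print(bad)
```

With a single slash after `http:`, the parser finds no host or port at all and treats the whole service name as a path, which would explain why nothing can connect to it.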

Any help would be appreciated.

Thank you.

One more thing I noticed: dask-kubernetes/requirements.txt (on main in dask/dask-kubernetes on GitHub) may need to be updated, since PDB capability is provided by the k8s client starting with 21.7.0?

PDBs were added to the remote scheduler spec.

cc @ian in case you have thoughts here

Ian will likely be able to answer your labextension-specific questions. From the Dask Kubernetes side, you won't be able to connect to the internal k8s domain for the service unless your machine is within the k8s cluster (which I assume it isn't). So you need to use the port-forwarded URL, port forward it yourself, or expose the dashboard via another means and use that URL.
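As a quick way to tell which situation you are in, here is a rough sketch of a reachability check using only the Python standard library. The function name `can_reach` is made up for illustration, and the addresses at the bottom assume you have forwarded the scheduler ports to your own machine (for example with `kubectl port-forward`); substitute whatever host and port you are actually trying.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


# Assumed local addresses for a port-forwarded scheduler (8786) and dashboard (8787).
for host, port in [("127.0.0.1", 8786), ("127.0.0.1", 8787)]:
    print(f"{host}:{port} reachable: {can_reach(host, port, timeout=1.0)}")
```

If this returns False for the `*.svc.cluster.local` name but True for the forwarded address, the machine running the check is outside the cluster network and the forwarded URL is the one to use. Note that this only tests TCP connectivity from wherever the code runs, so run it from the same place the Jupyter server runs.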

Thank you @jacobtomlinson. Understood. This makes sense. Appreciate the reply.

So, I have also tried an external NLB LoadBalancer service in front of the scheduler. Dask clients are able to interact with it over tcp/8786 for management and http/8787 for the status page.

However, even that did not work. There might be something I am doing wrong, and I am trying to understand what it is specifically.

I have listed some additional details in this issue: "Help connecting to existing KubeCluster using the build-in Discovery Mechanism" (Issue #255 in dask/dask-labextension on GitHub).

cc: @ian