Dask Scheduler Dashboard via Jupyter Proxy

We are using Jupyter Proxy in the SageMaker Studio environment to proxy connections to Dask. The Dask clusters are maintained as SageMaker Processing Jobs.

The URL pattern looks like this:
https://{notebook hash}.studio.us-east-1.sagemaker.aws/jupyterlab/default/proxy/{VPC IP}:8787/status
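
For anyone reproducing this, the pattern above can be filled in with a small helper. This is only a sketch: the function name `dashboard_proxy_url` and its parameter names are hypothetical, and the sample values in the usage note are placeholders rather than real identifiers.

```python
def dashboard_proxy_url(
    notebook_hash: str,
    scheduler_ip: str,
    port: int = 8787,
    region: str = "us-east-1",
    page: str = "status",
) -> str:
    """Fill in the Jupyter Proxy URL pattern quoted above.

    Hypothetical helper: it only formats a string and does not
    check that the scheduler is actually reachable.
    """
    return (
        f"https://{notebook_hash}.studio.{region}.sagemaker.aws"
        f"/jupyterlab/default/proxy/{scheduler_ip}:{port}/{page}"
    )
```

Calling `dashboard_proxy_url("abc123", "10.0.1.25")` with those placeholder values yields a URL of the same shape as the pattern, ending in `/proxy/10.0.1.25:8787/status`.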

The issue is that we are not able to proxy the WebSocket connections that Bokeh initiates.
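
To make the failing request easier to spot in the browser's network tab, here is a rough sketch of the WebSocket URL a Bokeh page would be expected to open. It assumes Bokeh's usual convention of serving the socket at the page path plus `/ws`; the helper name `bokeh_websocket_url` is hypothetical, and the exact URL the browser uses behind the proxy may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def bokeh_websocket_url(page_url: str) -> str:
    """Derive the expected WebSocket URL for a Bokeh page URL.

    Assumption: the socket lives at "<page path>/ws", and an
    https page upgrades to wss (http to ws).
    """
    parts = urlsplit(page_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = parts.path.rstrip("/") + "/ws"
    # Drop any query string or fragment; only scheme, host, and path matter here.
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

If the proxied `/status` page loads but the request to the derived `.../status/ws` URL fails to upgrade, that points at the proxy layer rather than at the Dask scheduler itself.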

Is there a proven pattern to allow Dask Dashboard to be exposed via Jupyter Proxy?

We have successfully been able to proxy using Kubeflow and Virtual Services. However, it is challenging when you need to tunnel the traffic over Jupyter Proxy.

Additionally, we would like to know whether the Dask extensions work as expected in the SageMaker environment. Please see the attached screenshot. We are not sure whether this is related to the missing YAML configuration.

Thanks.

[Screenshot: Websockets Error - Bokeh]

[Screenshot: Dask Extension issue]

There’s a long-running issue about this here:

Our understanding is that this has to be fixed on the SageMaker side. There are two things you might try:

  1. Bugging your AWS rep to add internal pressure to this issue within AWS
  2. Using some other approach (I’ll suggest coiled.io but I’m biased)

Hi Matthew,

Thank you so much for the update and link to the issue. We were able to work around the issue, as long as the scheduler gets deployed in private VPC Subnets and SageMaker domain is in VPC-only mode.

From the issue, I see that the websocket problem is likely in the AWS control plane. I will try to engage with them to add support for Dask. In fact, we are already in contact with AWS reps about this.

I appreciate the quick response and the link to the relevant GitHub issue.