Could not create cluster with Dask Gateway on k8s

Hi everyone,

When I try to create a cluster via dask-gateway on a local k8s cluster, I get the following error message:
"GatewayClusterError: Cluster ‘dhub.b203c688121d45d3b219b0da48dabd7c’ failed to start, see logs for more information"

When I check the dask-gateway controller logs, I can see that the cluster was created but was then deleted.

I used dask-gateway version 0.8.0 for the gateway and controller. For the backend I used the versions below: I built a custom image with the following packages.

    aiohttp=3.6.2 \
    dask==2.21.0 \
    distributed==2.21.0 \
    numpy==1.19.1 \
    pandas==1.0.5 \
    bokeh==2.2.3 \

@consideRatio Hi Erik! Could you help me with this issue? Thanks in advance.

Hi @menendes! Thanks for the question. I’m no k8s expert, but maybe @jacobtomlinson has some thoughts on how to help here?

Hi @scharlottej13! Thank you for your response. The reason I use a custom image is that when I used the daskhub Helm chart as-is, I could not connect to an existing cluster. When I searched for the error, people suggested creating a custom image with the packages above. I built the custom image and set it as the backend image in the Helm values.yaml file. After deploying the chart with that change, I can connect to the Dask Gateway but I cannot create a cluster. The logs look like the screenshot above. We can verify that the cluster is created, but it gets stuck in the “ContainerCreating” status and is terminated immediately. I think something goes wrong while the cluster (scheduler) container is being created, and the cause may be incompatible library versions, but I am not sure. I am still investigating and am stuck at this point.

Hi @menendes, is there a reason you are using such old versions of Dask and Dask Gateway?

The first thing I would recommend is trying up-to-date versions of the libraries. Can you do that?
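
To see which versions are actually installed, something like the following could be run both in the client environment and inside the custom image. It is only a minimal sketch using the standard library: the package list is copied from the post above plus dask-gateway, and `installed_versions` is a hypothetical helper name, not part of Dask.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the list earlier in this thread; adjust as needed.
PACKAGES = ["dask", "distributed", "dask-gateway", "numpy", "pandas", "bokeh"]


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is missing from this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it in both places and comparing the output should make it clear whether the client and the scheduler/worker image have drifted apart.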

One thing that might help: sharing the yaml file you’re using to configure your daskhub.

Hi everyone,

@guillaumeeb yes, you are right. The problem was related to the versions: after I upgraded them, everything works now. Thanks for all your responses.

I get this error as well. I believe the cause is the default Docker image used for the scheduler and workers: daskgateway/dask-gateway:0.9.0. It has an old Dask version installed (2.30), as the image is a year old. An upgrade would be very much appreciated.
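
As a rough illustration of the kind of mismatch I mean, here is a small sketch that compares two sets of versions by major.minor. The client versions and the image's distributed version are made up for the example; only the image's Dask version (2.30) comes from what I saw. `major_minor` and `find_mismatches` are hypothetical helper names, and the sketch assumes plain numeric versions such as 2.30 or 2.30.0.

```python
def major_minor(ver):
    """Reduce a version string such as '2.30.0' to the tuple (2, 30)."""
    parts = ver.split(".")
    return int(parts[0]), int(parts[1])


def find_mismatches(client, image):
    """Return {package: (client version, image version)} where major.minor differ."""
    mismatches = {}
    # Only compare packages that are present on both sides.
    for name in sorted(set(client) & set(image)):
        if major_minor(client[name]) != major_minor(image[name]):
            mismatches[name] = (client[name], image[name])
    return mismatches


# Hypothetical example data: the client versions are assumptions for illustration,
# and the image is assumed to ship distributed at the same version as dask.
client_versions = {"dask": "2021.11.0", "distributed": "2021.11.0"}
image_versions = {"dask": "2.30", "distributed": "2.30"}

print(find_mismatches(client_versions, image_versions))
```

Any package that shows up in the result is a candidate for pinning to the same version on both sides.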

Cheers

H

I just wanted to update folks here that we are making slow progress with getting a dask-gateway release out. You can track things here.
