Error with Dask 2024.1.0

I am getting the following error on a worker node while processing data from one of the columns in a Parquet file. In my local setup it works fine with Dask 2024.5.0, whereas I was able to replicate the error after downgrading to Dask 2024.1.0.

```
2024-07-31 11:54:14,380 - distributed.worker - WARNING - Compute Failed
Key:       ('to_pyarrow_string-5e4bd9cb9af8f2ac2e4e9d86e0b9a369', 6)
Function:  subgraph_callable-d1069dda-d0df-468e-a267-30bda267
args:      (   id  ...  teas...data
           3    0  ...  b'\xb7\xee\x86?x\xb6\x0b?7V]=g\x0f\xb4<~\x9e\x...
           4    0  ...  b'\x85\x94\xb7?S\...
kwargs:    {}
Exception: 'UnicodeDecodeError("'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte: Error while type casting for column 'metadata'")'
```

Is there any way to upgrade the cluster to Dask 2024.5.0, or is there any workaround?
Any help is highly appreciated.
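
One thing I am considering, though I have not verified it on the affected versions, is disabling Dask's automatic object-to-PyArrow string conversion before reading the Parquet file, since the failing key is `to_pyarrow_string`. A rough sketch (the file path is a placeholder, and that the `dataframe.convert-string` option is what triggers this step is my assumption):

```python
import dask
import dask.dataframe as dd

# Assumption: the UnicodeDecodeError comes from Dask's automatic
# object -> string[pyarrow] conversion choking on raw bytes in the
# 'metadata' column. Turning the conversion off keeps the bytes as-is.
dask.config.set({"dataframe.convert-string": False})

df = dd.read_parquet("path/to/data.parquet")  # placeholder path
result = df.compute()
```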

I set up the Kubernetes cluster in Azure using

```bash
helm install dask dask/dask --set scheduler.image.repository=
```

and it defaults to Dask 2024.1.0.
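
If the chart exposes image tags, pinning them at install time might be an alternative to a custom image; a sketch, assuming the chart's standard `scheduler.image.tag` and `worker.image.tag` values and that the repository publishes a matching tag:

```bash
# Hypothetical: pin scheduler and worker to a specific Dask release.
helm upgrade --install dask dask/dask \
  --set scheduler.image.tag=2024.5.0 \
  --set worker.image.tag=2024.5.0
```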

Thank You,
Renjith R

This got resolved by creating a custom image with Dask 2024.5.0 and deploying it to the cluster. 🙂
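
For reference, a minimal sketch of such an image, assuming the official `ghcr.io/dask/dask` base image publishes a matching tag (the exact base and extra dependencies will differ per setup):

```dockerfile
# Assumed base image and tag; adjust to the registry/release you actually use.
FROM ghcr.io/dask/dask:2024.5.0

# Install any extra packages your workers need, e.g.:
# RUN pip install --no-cache-dir <your-extra-packages>
```

The chart can then be pointed at this image through its `scheduler.image.repository`/`scheduler.image.tag` and `worker.image.repository`/`worker.image.tag` values.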

As of now I'm good, no further support needed.

Hi @renjthmails,

I'm not sure which method you used to deploy your Dask cluster on Kubernetes; the recommended tool is documented here: Dask Kubernetes Operator — Dask Kubernetes 2024.5.1.dev5+g734d001 documentation.
The command you used does not look like it matches that approach. In any case, the latest Dask version should be deployed.
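
With the operator installed, a cluster pinned to a specific image can be created from Python; a minimal sketch, assuming the `ghcr.io/dask/dask:2024.5.0` tag exists (the cluster name is hypothetical):

```python
from dask_kubernetes.operator import KubeCluster
from distributed import Client

# Assumes the Dask Kubernetes Operator is running in the cluster and
# the image tag below exists; adjust name/image/workers as needed.
cluster = KubeCluster(
    name="renjith-demo",  # hypothetical cluster name
    image="ghcr.io/dask/dask:2024.5.0",
    n_workers=2,
)
client = Client(cluster)
# ... run your computation here ...
client.close()
cluster.close()
```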
