I am getting the following error on a worker node while processing data from one of the columns in a parquet file. In my local setup it works fine with Dask 2024.5.0, whereas I was able to replicate the error by downgrading to Dask 2024.0.0.
```
2024-07-31 11:54:14,380 - distributed.worker - WARNING - Compute Failed
Key:       ('to_pyarrow_string-5e4bd9cb9af8f2ac2e4e9d86e0b9a369', 6)
Function:  subgraph_callable-d1069dda-d0df-468e-a267-30bda267
args:      (      id  ...  teas...data
3   0  ...  b'\xb7\xee\x86?x\xb6\x0b?7V]=g\x0f\xb4<~\x9e\x...
4   0  ...  b'\x85\x94\xb7?S\=
kwargs:    {}
Exception: 'UnicodeDecodeError("'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte: Error while type casting for column 'metadata'")'
```
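For context, the job is roughly doing the following (the path and column names are simplified; this is a sketch, not the exact code). The 'metadata' column holds raw binary data:

```python
import dask.dataframe as dd

# Sketch of the failing job: read parquet and trigger computation.
# The 'metadata' column contains raw bytes; the to_pyarrow_string task
# in the traceback above fails while casting it to utf-8 strings.
df = dd.read_parquet("path/to/data/*.parquet")
result = df.compute()  # UnicodeDecodeError raised on the worker here
```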
Is there any way to upgrade the cluster to Dask 2024.5.0, or is there a workaround?
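One thing I was wondering about (I am not sure this is the right setting, so please correct me) is disabling Dask's automatic conversion to pyarrow strings, since the failing task is to_pyarrow_string:

```python
import dask

# Assumption: the failing cast comes from the automatic object -> pyarrow
# string conversion, which this config option controls; disabling it
# should leave the 'metadata' bytes column untouched.
dask.config.set({"dataframe.convert-string": False})
```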
Any help is highly appreciated.
I set up the Kubernetes cluster in Azure using

```
helm install dask dask/dask --set scheduler.image.repository=
```

and it defaults to Dask 2024.0.0.
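I was hoping something like this would pin the newer version, assuming the chart exposes scheduler.image.tag / worker.image.tag and a 2024.5.0 image tag is published (I have not verified these values):

```
helm upgrade dask dask/dask \
  --set scheduler.image.tag=2024.5.0 \
  --set worker.image.tag=2024.5.0
```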
Thank You,
Renjith R