We upgraded to:
dask 2024.2.1
dask-expr 0.5.3
pandas 2.2.0
Python 3.10.9
But we are getting an error when running with dask.config.set({'dataframe.query-planning': True}).
There is no issue with dask.config.set({'dataframe.query-planning': False}).
Case 1, query planning on and convert-string on:
dask.config.set({'dataframe.query-planning': True})
dask.config.set({"dataframe.convert-string": True})
  File "/usr/lds20/lib/python3.10/site-packages/dask_expr/_collection.py", line 418, in compute
    return DaskMethodsMixin.compute(out, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 375, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/distributed/client.py", line 2244, in _gather
    raise exception.with_traceback(traceback)
distributed.scheduler.KilledWorker: Attempted to run task ('readparquetfsspec-fused-assign-813554ebc1dd0e08a725e6f0226f41f1', 15) on 4 different workers, but all those workers died while running it. The last worker that attempt to run the task was tcp://10.120.105.227:21149. Inspecting worker logs is often a good next step to diagnose what went wrong. For more information see https://distributed.dask.org/en/stable/killed.html.
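Following the hint in the message above, the worker logs could be collected roughly as shown here. This is only a sketch under assumptions: it expects a connected distributed.Client, the helper name collect_worker_errors and the keyword list are made up for illustration, and the scheduler address in the comment is hypothetical. It asks for the nanny logs on the assumption that the nanny usually records why it restarted a worker.

```python
def collect_worker_errors(client, n=200,
                          keywords=("error", "killed", "memory", "restart")):
    """Return {worker_address: [message, ...]} for recent suspicious log lines.

    `client` is assumed to be a connected distributed.Client; only its
    get_worker_logs() method is used. The helper name and the keyword
    list are illustrative, not part of dask or distributed.
    """
    # nanny=True requests the nanny logs instead of the worker logs.
    logs = client.get_worker_logs(n=n, nanny=True)
    hits = {}
    for address, entries in logs.items():
        matched = []
        for entry in entries:
            # Entries are expected to be (level, message) pairs; fall back
            # to the string form if the shape is different.
            if isinstance(entry, (tuple, list)):
                level, message = str(entry[0]), str(entry[-1])
            else:
                level, message = "", str(entry)
            text = f"{level} {message}".lower()
            if any(k in text for k in keywords):
                matched.append(message)
        if matched:
            hits[address] = matched
    return hits


# Hypothetical usage against a running cluster:
#   from distributed import Client
#   client = Client("tcp://scheduler-host:8786")  # address is a placeholder
#   for address, messages in collect_worker_errors(client).items():
#       print(address, *messages, sep="\n  ")
```

If the output shows memory-limit restarts, that would point toward the fused read-parquet task using more memory per partition than before, but that is a guess until the logs are checked.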
Case 2, query planning on and convert-string off:
dask.config.set({'dataframe.query-planning': True})
dask.config.set({"dataframe.convert-string": False})
  File "/usr/lds20/lib/python3.10/site-packages/dask_expr/_collection.py", line 418, in compute
    return DaskMethodsMixin.compute(out, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 375, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/distributed/client.py", line 2245, in _gather
    raise exc
concurrent.futures._base.CancelledError: ('repartitiontofewer-e7c56c48032e07c611c98b69ce76cefa', 0)
Package list:
Package Version
------------------ ---------
bokeh 3.3.4
click 8.1.7
cloudpickle 3.0.0
contourpy 1.2.0
cx-Oracle 8.3.0
dask 2024.2.1
dask-expr 0.5.3
distributed 2024.2.1
fsspec 2024.2.0
greenlet 3.0.3
importlib-metadata 7.0.1
Jinja2 3.1.3
locket 1.0.0
lz4 4.3.3
MarkupSafe 2.1.5
modin 0.27.0
msgpack 1.0.7
numpy 1.26.4
packaging 23.2
pandas 2.2.0
parquet 1.3.1
partd 1.4.1
pillow 10.2.0
pip 23.3.1
ply 3.11
psutil 5.9.8
pyarrow 15.0.0
pyarrow-hotfix 0.6
pysqlite3 0.5.2
python-dateutil 2.8.2
pytz 2024.1
PyYAML 6.0.1
setuptools 69.0.2
six 1.16.0
sortedcontainers 2.4.0
SQLAlchemy 2.0.27
tblib 3.0.0
thriftpy2 0.4.17
toolz 0.12.1
tornado 6.4
typing_extensions 4.9.0
tzdata 2024.1
urllib3 2.2.0
wheel 0.42.0
xarray 2024.1.1
xyzservices 2023.10.1
zict 3.0.0
zipp 3.17.0