Error with dask-expr: dask.config.set({'dataframe.query-planning': True})

We upgraded to
dask 2024.2.1
dask-expr 0.5.3
pandas 2.2.0

Python 3.10.9

But we are getting an error when running with dask.config.set({'dataframe.query-planning': True}).
There is no issue with dask.config.set({'dataframe.query-planning': False}).

With convert-string enabled:
dask.config.set({'dataframe.query-planning': True})
dask.config.set({"dataframe.convert-string": True})
  File "/usr/lds20/lib/python3.10/site-packages/dask_expr/_collection.py", line 418, in compute
    return DaskMethodsMixin.compute(out, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 375, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/distributed/client.py", line 2244, in _gather
    raise exception.with_traceback(traceback)
distributed.scheduler.KilledWorker: Attempted to run task ('readparquetfsspec-fused-assign-813554ebc1dd0e08a725e6f0226f41f1', 15) on 4 different workers, but all those workers died while running it. The last worker that attempt to run the task was tcp://10.120.105.227:21149. Inspecting worker logs is often a good next step to diagnose what went wrong. For more information see https://distributed.dask.org/en/stable/killed.html.

With convert-string disabled:
dask.config.set({'dataframe.query-planning': True})
dask.config.set({"dataframe.convert-string": False})
  File "/usr/lds20/lib/python3.10/site-packages/dask_expr/_collection.py", line 418, in compute
    return DaskMethodsMixin.compute(out, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 375, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/lds20/lib/python3.10/site-packages/distributed/client.py", line 2245, in _gather
    raise exc
concurrent.futures._base.CancelledError: ('repartitiontofewer-e7c56c48032e07c611c98b69ce76cefa', 0)

Package list:

Package            Version
------------------ ---------
bokeh              3.3.4
click              8.1.7
cloudpickle        3.0.0
contourpy          1.2.0
cx-Oracle          8.3.0
dask               2024.2.1
dask-expr          0.5.3
distributed        2024.2.1
fsspec             2024.2.0
greenlet           3.0.3
importlib-metadata 7.0.1
Jinja2             3.1.3
locket             1.0.0
lz4                4.3.3
MarkupSafe         2.1.5
modin              0.27.0
msgpack            1.0.7
numpy              1.26.4
packaging          23.2
pandas             2.2.0
parquet            1.3.1
partd              1.4.1
pillow             10.2.0
pip                23.3.1
ply                3.11
psutil             5.9.8
pyarrow            15.0.0
pyarrow-hotfix     0.6
pysqlite3          0.5.2
python-dateutil    2.8.2
pytz               2024.1
PyYAML             6.0.1
setuptools         69.0.2
six                1.16.0
sortedcontainers   2.4.0
SQLAlchemy         2.0.27
tblib              3.0.0
thriftpy2          0.4.17
toolz              0.12.1
tornado            6.4
typing_extensions  4.9.0
tzdata             2024.1
urllib3            2.2.0
wheel              0.42.0
xarray             2024.1.1
xyzservices        2023.10.1
zict               3.0.0
zipp               3.17.0
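
For a report like this, most of the list above is probably not relevant. A short stdlib-only sketch for printing just the packages that are likely to matter is shown here. The choice of names in RELEVANT is an assumption on my part, not an official list, and collect_versions is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages assumed to matter for this report (a guess, not an official list).
RELEVANT = ["dask", "dask-expr", "distributed", "pandas", "pyarrow", "numpy", "fsspec"]


def collect_versions(names):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Keep the entry so a missing package is visible in the report.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in collect_versions(RELEVANT).items():
        print(f"{name:<12} {ver or 'not installed'}")
```

Running this on the client and on a worker makes it easier to spot a version mismatch between the two.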

Hi @slepeturin, welcome to the Dask community!

Would you be able to provide a minimal reproducer that raises this error? At least a code snippet would help.

As query planning with dask-expr is a recent feature, you are also encouraged to give feedback on GitHub, or even open a new issue, but I think the report still lacks a bit of context at this point.

cc @fjetter

Hi @guillaumeeb,
Thank you for the reply.
We are planning to upgrade to the latest Dask version, 2024.3.0.
Maybe that will solve the issue.
Thank you,
Steve