Scheduler sends same task to many workers

It seems like there is a race condition in dask that leads to the same task being offered to more than one worker. This leads to fatal bugs when an 'impossible' transition results, e.g.:

```
RuntimeError: Task 'xyz' transitioned from processing to memory on worker <WorkerState 'tcp://XXX', name: XXX, status: running, memory: 555, processing: 1>, while it was expected from <WorkerState 'tcp://YYY', name: YYY, status: init, memory: 0, processing: 1>. This should be impossible.
```

Is there any known workaround for this?

Also, we're running a pretty old version of dask (2022.11.1); has this been fixed in a more recent version?

Hi @smuirsmi,

Off the top of my head, I can't recall any fixes for the issue you describe landing since your version, and there are no known issues in the latest release that could cause it. I may be wrong though, so it would really help if you could try to reproduce the problem on the latest available version of dask.
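If it helps, a quick way to check which versions of dask and distributed you are actually running before and after upgrading (the versions in the comments are just examples):

```python
# Print the installed dask and distributed versions so they can be
# compared against the latest release on PyPI / conda-forge.
import dask
import distributed

print("dask:", dask.__version__)                # e.g. 2022.11.1
print("distributed:", distributed.__version__)  # should match dask's version
```

You can then upgrade with e.g. `pip install --upgrade dask distributed` (or the equivalent conda command) and retry.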

If you can still reproduce the issue on the latest version, the second thing we'd need in order to investigate this is a cluster dump (see `Client.dump_cluster_state` in the Dask.distributed API documentation).
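For example, a minimal sketch of producing such a dump, assuming the bad transition has just occurred on a running cluster (the scheduler address and filename below are placeholders):

```python
# Connect to the affected cluster and write a dump of the scheduler
# and worker state to disk so it can be attached to a bug report.
from distributed import Client

client = Client("tcp://scheduler-address:8786")  # placeholder address

# With the default msgpack format this writes "cluster-dump.msgpack.gz",
# containing the task states seen by the scheduler and by each worker.
client.dump_cluster_state("cluster-dump")
```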