How to find count of idle workers from scheduler_info?

bozden · September 12, 2024, 2:35am

This is how it works:

Here I have English and German datasets, and English is about 3x larger than German.
So I start with 2 io_bound and 12 cpu_bound tasks, 14 in total.
When the German dataset finishes, its freed slot is also used for cpu_bound tasks (not something I want, because of context switching).
Also, as you can see, some workers have more than one task (20 jobs in total in this image): my not-running-tasks count sometimes returns 2, so the total went up. Not a problem though, they are just queued. But I don’t want to queue them all up front, so that I can still rescale to idle cores. Would cluster.adapt() be helpful for that?

If I had left it alone, only 6 logical cores would be running English chunks, so I “steal workers”. Is work stealing meant for cases like this?
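
For the question in the title, here is a minimal sketch of one way to count idle workers. It assumes the input is a mapping from worker address to the list of task keys assigned to that worker, which is the shape I understand `Client.processing()` to return in distributed. The function name `count_idle_workers`, the worker addresses and the task keys are made up for illustration, and a worker counts as "idle" here only if nothing at all is assigned to it (queued tasks count as busy).

```python
from typing import Mapping, Sequence


def count_idle_workers(processing: Mapping[str, Sequence[object]]) -> int:
    """Count workers that currently have no task assigned.

    `processing` maps a worker address to the task keys assigned to it.
    Assumption: this matches what client.processing() returns.
    """
    return sum(1 for tasks in processing.values() if len(tasks) == 0)


# Hand-written sample data (hypothetical addresses and task keys).
sample = {
    "tcp://127.0.0.1:40001": ["english-chunk-0"],
    "tcp://127.0.0.1:40002": [],
    "tcp://127.0.0.1:40003": ["english-chunk-1", "english-chunk-2"],
}

print(count_idle_workers(sample))
```

With a live cluster the call would be something like `count_idle_workers(client.processing())`. Getting the same number out of `client.scheduler_info()` means reading per-worker metrics, and as far as I can tell the layout of those metrics differs between distributed versions, so I have not sketched that here.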
