Batched Dask Worker Deployment on Kubernetes

Hello friends,

In Spark on Kubernetes, there is an option to set a batch size for the number of workers that are spun up simultaneously within a larger job. For example, with 100 workers and a batch size of 10, the scheduler spins up ten gangs of 10 workers each, one gang at a time, which lessens the load on the Kubernetes scheduler as each gang is provisioned.

I am wondering whether similar functionality exists anywhere in Dask's KubeCluster, or whether there is an easy way to get the Dask scheduler to wait a short period of time between pods, rather than dumping all of the workers on the cluster at once.

Once we exceed ~40 workers at once, we run into issues with our IAM credential provisioner in AWS. If it's possible, addressing this on the cluster side would be easier than maintaining other workarounds.

Thanks! -DH


We don't, but this sounds like a great idea. Would you be able to open an issue on dask-kubernetes to propose this?
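In the meantime, one client-side workaround is to batch the scale-up yourself by calling `cluster.scale` in increments with a pause between calls. A minimal sketch (the batch size, delay, and the `FakeCluster` stand-in are illustrative; with a real `dask_kubernetes.KubeCluster` you would pass the cluster object directly):

```python
import time


def scale_in_batches(cluster, target, batch_size=10, delay=30):
    """Scale a Dask cluster up to `target` workers in batches,
    sleeping between batches to spread load on the Kubernetes
    scheduler (and anything downstream, like an IAM provisioner)."""
    current = 0
    while current < target:
        current = min(current + batch_size, target)
        cluster.scale(current)  # request workers up to the next batch boundary
        if current < target:
            time.sleep(delay)  # give provisioning time to catch up


# Stand-in cluster object that just records the scale() calls,
# so the batching logic can be demonstrated without a real cluster:
class FakeCluster:
    def __init__(self):
        self.calls = []

    def scale(self, n):
        self.calls.append(n)


fake = FakeCluster()
scale_in_batches(fake, target=25, batch_size=10, delay=0)
print(fake.calls)  # → [10, 20, 25]
```

This doesn't make the scheduler itself batch-aware, but it keeps the number of simultaneously pending pods bounded without any changes to dask-kubernetes.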


Will do, thanks for the quick reply. Opened Add batched worker provisioning to Dask Cluster spawning on Kubernetes · Issue #733 · dask/dask-kubernetes · GitHub to close the loop.