Hi,
we are currently trying to migrate from a static Dask deployment with n workers to an adaptive deployment. For this we are using the new dask-kubernetes operator.
When I ran an example workload with an adaptive range of 1-6 workers, it was significantly slower (up to 5x) than running with min=max=3 workers. The operator appears to be constantly scaling workers up and down. I then had another look at the documentation and discovered the Adaptive class (Adaptive deployments — Dask documentation), which has constructor parameters like target_duration and wait_count. Is it possible to configure these when using the dask-kubernetes operator, or are they only available through the Python API?
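For context, these knobs are exposed through the Python API roughly like this (a minimal sketch, assuming a plain distributed cluster; the parameter values are just illustrative):

```python
# Sketch of tuning adaptive scaling via the Python API, not the operator.
# ``Cluster.adapt`` forwards extra keyword arguments to the ``Adaptive``
# class, so ``target_duration`` and ``wait_count`` can be set there.
from dask.distributed import LocalCluster

cluster = LocalCluster(n_workers=0)
cluster.adapt(
    minimum=1,
    maximum=6,
    target_duration="60s",  # aim for the queued work to finish in ~60s
    wait_count=3,           # require 3 consecutive scale-down recommendations
)
```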
Hi @sil-lnagel, welcome to the Dask Discourse forum!
Based on the source code, I don’t think these specific adaptive features are available in dask-kubernetes.
But maybe @jacobtomlinson would prove me wrong?
The adaptive scaling in dask-kubernetes is handled by the controller, rather than client side via the Adaptive class.
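For comparison, here is a minimal sketch of what adaptive mode looks like on the operator side as I understand it (the cluster name is hypothetical): `adapt` only takes `minimum` and `maximum`, since it just creates a DaskAutoscaler resource and leaves the actual scaling decisions to the controller.

```python
# Hedged sketch of adaptive mode with the dask-kubernetes operator.
# adapt() here only accepts minimum/maximum: it creates a DaskAutoscaler
# resource, and the controller (not the client-side Adaptive class)
# decides when to scale, so target_duration/wait_count have no effect.
from dask_kubernetes.operator import KubeCluster

cluster = KubeCluster(name="example")  # hypothetical cluster name
cluster.adapt(minimum=1, maximum=6)
```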
If you’re seeing poor autoscaling behaviour could I ask you to open an issue on the GitHub repo with a code example that demonstrates the problem so I can look into it?
Thanks a lot for your explanation @jacobtomlinson. Regarding the example: it is a bit tricky to create one, but I am working on it. If I manage to create one, I will add it to my other question, Shuffle P2P unstable with adaptive k8s operator?. That ticket also contains a better description of the likely “root cause” of our problems.