I have a setup where a Python script kicks off a Slurm cluster of Dask tasks. It is using the concurrent futures interface. I am using adapt(minimum=1, maximum=500), but it never scales up past 1 Dask node. When I use the scale interface it scales fine. I do have long-running tasks. Does Dask need to wait until the first concurrent task returns before scaling up more? What would cause Dask to not scale up the cluster/Dask nodes?
Hi @M1Sports20, welcome to Dask community!
By default, yes: it needs to know how long a single task will last before deciding whether to autoscale.
AFAIK, there are two solutions to this:
- Specify roughly the duration of a given function you're submitting in the config:

```python
import dask
dask.config.set({'distributed.scheduler.default-task-durations.my_function': '1h'})
```
- Specify a long duration for all unknown tasks:

```python
import dask
dask.config.set({'distributed.scheduler.unknown-task-duration': '1h'})
```
Both should work; report back if they don't.
Also be careful with dask-jobqueue and autoscaling, it works better with only one worker process per job!
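To illustrate the one-worker-process-per-job advice, here's a minimal sketch of a dask-jobqueue `SLURMCluster` set up for adaptive scaling. The resource values (cores, memory, walltime) are illustrative placeholders, not taken from this thread:

```python
# Sketch: adaptive SLURMCluster with one worker process per Slurm job.
# Resource values below are placeholders; adjust for your cluster.
from dask_jobqueue import SLURMCluster
from distributed import Client

cluster = SLURMCluster(
    cores=1,            # threads per job (placeholder)
    processes=1,        # one worker process per job: plays better with autoscaling
    memory="4GB",       # placeholder
    walltime="02:00:00",
)
cluster.adapt(minimum=1, maximum=500)  # scheduler submits/cancels Slurm jobs as needed
client = Client(cluster)
```

With `processes=1`, each Slurm job maps to exactly one Dask worker, so the adaptive scheduler's decisions about adding or removing workers translate cleanly into submitting or cancelling jobs.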
That seemed to work. I did notice issues with autoscaling and dask-jobqueue, and also set it to one worker process per job a few days ago.
I will say it does still take a little time for it to ramp up and submit more slurm jobs. But it at least works.
Thanks!