Hi!
I’d like to set up a local cluster that has CPU_COUNT workers (with one thread each), where exactly one of the workers has a resource, e.g. “GPU=1”, assigned. To my understanding, when running
dask-worker <scheduler> --nworkers auto --resources="GPU=1"
the resource label is attached to all worker processes. Is it possible to assign it to only one worker?
Thanks a lot in advance!
@hayi Thanks for this question!
I believe you can start the one worker with special resources separately:
dask-worker <scheduler> --nthreads 1 --resources "GPU=1"
then start the rest as you’ve shown:
dask-worker <scheduler> --nworkers auto --nthreads 1
Reference documentation: Worker Resources — Dask.distributed 2022.8.1+6.gc15a10e8 documentation
Does this help?
Thanks @pavithraes for the quick reply!
This is getting closer to what I’m aiming at, but not quite there yet. While the first command spins up the resource worker, the second command spins up another 4 workers (5 workers in total instead of 4).
Is there a way to tell the second command to spin up “CPU_COUNT minus 1” workers?
Thanks!
@hayi Thanks for your patience! You’re right that “auto” wouldn’t help here. Not only will it create cpu_count number of workers, but it’ll also override the --nthreads 1 option.
I don’t think we can directly create “CPU_COUNT minus 1” number of workers. You’ll need to find the CPU_COUNT for your machine, and maybe save it as an environment variable, and then use it.
I’ve opened an issue related to this: Better description and warnings for using `n_workers=auto` · Issue #6097 · dask/distributed · GitHub
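For anyone landing here later, the “CPU_COUNT minus 1” workaround could be scripted roughly like this. It is only a sketch: worker_commands is a hypothetical helper name, the scheduler address is a placeholder for your own, and it only reuses the dask-worker flags already shown in this thread. It builds the two command lines and prints them, without starting anything.

```python
import os
import shlex


def worker_commands(scheduler, cpu_count=None):
    """Build the dask-worker command lines for CPU_COUNT single-threaded workers.

    Hypothetical helper: exactly one worker gets the GPU=1 resource label,
    the remaining CPU_COUNT - 1 workers are started without it.
    """
    if cpu_count is None:
        # os.cpu_count() can return None, so fall back to a single CPU.
        cpu_count = os.cpu_count() or 1

    # The one worker that carries the resource label.
    commands = [
        ["dask-worker", scheduler, "--nthreads", "1", "--resources", "GPU=1"]
    ]

    # The remaining workers, with an explicit count instead of "auto".
    plain_workers = cpu_count - 1
    if plain_workers > 0:
        commands.append(
            ["dask-worker", scheduler,
             "--nworkers", str(plain_workers), "--nthreads", "1"]
        )
    return commands


if __name__ == "__main__":
    # Placeholder address (assumption); replace it with your scheduler's.
    for cmd in worker_commands("tcp://127.0.0.1:8786"):
        print(shlex.join(cmd))
```

Each printed line can be run in its own terminal, or the argument lists can be passed to subprocess.Popen. Passing an explicit number to --nworkers avoids “auto”, so --nthreads 1 should be respected.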