Hi
Very sorry again for the late reply.
- For the memory part, yes, it is quite a lot. The strange thing is that when I set the memory size to around 10 GB, it starts complaining about memory spilling and a worker exceeding 95% of its memory budget, or something like that. I don't know why; I didn't run into these problems when running the distributed scheduler on a local machine (see the sketches after this list).
- The `LocalCluster` is implemented as a distributed Dask scheduler running on a single machine, with a given number of processes and a number of threads per process (first sketch below).
- For the `performance_report` when running dask.distributed integrated with `SLURMCluster`, I activate my Python environment first before running the main code. As a precaution, I also activate the same Python environment in the `SLURMCluster` configuration, to be sure the workers are using the same Python environment (second sketch below).
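
For reference, here is a minimal sketch of what I mean by the `LocalCluster` setup with a memory limit. The worker counts and the 10 GB limit are just placeholder values; the spilling/pausing behaviour comes from dask.distributed's worker-memory thresholds, which are fractions of `memory_limit`:

```python
from dask.distributed import Client, LocalCluster

# Placeholder values: 4 worker processes, 2 threads per process,
# and a 10 GB memory budget per worker process.
cluster = LocalCluster(n_workers=4, threads_per_worker=2, memory_limit="10GB")
client = Client(cluster)

# dask.distributed spills, pauses, and finally terminates workers at fixed
# fractions of memory_limit (terminate defaults to 95%), which is where the
# "worker exceeding 95% of memory budget" warning comes from.
```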
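
And a sketch of how I combine the environment activation with `performance_report` on the `SLURMCluster`. The resource numbers, environment path, and report filename are placeholders, and older dask_jobqueue versions use `env_extra` instead of `job_script_prologue`:

```python
from dask.distributed import Client, performance_report
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    cores=8,          # placeholder resources per SLURM job
    processes=2,
    memory="10GB",
    # Activate the same Python environment on the workers as well
    # (placeholder path; older dask_jobqueue versions call this env_extra).
    job_script_prologue=["source /path/to/my-env/bin/activate"],
)
cluster.scale(jobs=4)  # placeholder number of SLURM jobs
client = Client(cluster)

# Wrap the main computation so the HTML report covers it.
with performance_report(filename="dask-report.html"):
    pass  # run the main code here
```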
Regards