SLURMCluster on 64 nodes / Understanding Cluster scale method

Hi, I am trying to run a heavy task on 64 nodes, each with 32 cores. In particular, I want 64 Dask workers to perform my task, with each worker using 32 cores.

With my current setup I am always requesting 2048 workers, which I don’t want. I noticed this by setting breakpoints and monitoring len(cluster.scheduler.workers): it never reached more than ~1400 workers across several attempts with long wait times.

My goal is to have only 64 (Dask) workers and 2048 nannies, but I fail to establish such a setup. I guess that setup would then produce a single log file.

My current cluster and client are:

from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(memory='256g', processes=32, cores=32, queue='photon')
client = Client(cluster)
cluster.scale(jobs=64)

This produces multiple log files.
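If I read the dask-jobqueue behaviour correctly, this is where the 2048 comes from: each job script starts one worker process per "processes" on one node, so the total is jobs times processes. A sketch of the arithmetic only:

# Rough arithmetic behind my observation (assuming each job script
# starts `processes` worker processes on one node):
jobs = 64
processes = 32
total_workers = jobs * processes   # = 2048 workers, each with its own nanny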

I tried adding n_workers=64 to the SLURMCluster constructor, but it made no difference.

I partially solved the problem by calling cluster.scale(n=64) instead, but this led to extremely slow computation. In this case I had 64 workers, but also only 64 nannies (found by grepping “Start Nanny/workers” in the produced output logs). This produced a single log file.
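If I understand scale() correctly, n counts worker processes rather than jobs, so with processes=32 only a couple of job scripts are needed to reach 64 workers, which would mean only a couple of nodes actually run. Again just my sketch of the arithmetic, not verified against the scheduler:

import math

# scale(n=...) asks for n worker processes in total; dask-jobqueue then
# submits only enough jobs to cover that count.
n = 64
processes = 32
jobs_submitted = math.ceil(n / processes)   # = 2 jobs, i.e. only 2 of the 64 nodes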

Another observation, which might be obvious: I cannot specify more than 32 cores in the cluster constructor.

I hope I made my problem clear. My general goal is to scale distributed computation efficiently across 64 nodes (not fewer). To my understanding, getting the right number of workers and nannies is a step in the right direction. (I am confident that the dataframe I am using is capable of efficient distributed computation, and that the problems are not coming from there.)

A subquestion that is more important to me first: how can I start a job with 32 nannies and 1 worker?

The parameters that you give to the constructor define a single job, so it sounds like you want …

cluster = SLURMCluster(processes=1, cores=32, ...)
cluster.scale(jobs=64)
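Filled in with the memory and queue values from your first post (adjust those as needed), that would look something like:

from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# One job per node, one worker process per job, using all 32 cores as threads
cluster = SLURMCluster(memory='256g', processes=1, cores=32, queue='photon')
client = Client(cluster)

cluster.scale(jobs=64)        # 64 jobs -> 64 nodes -> 64 workers, 64 nannies
client.wait_for_workers(64)   # optionally block until all workers have connected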

To be really clear: this is impossible. A worker always has exactly one nanny.

To explain a little more: Dask is a bit different from Spark, just as Python is different from Java. Every process you launch is a worker; there is no such thing as a worker made of 32 processes.

The closest to what you want is what @mrocklin proposed, but there you’ll get 64 workers with 32 threads each. Depending on your computation, this might not be what you want, but I encourage you to try it!
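If 32 threads per worker turns out to be a problem (for example if your computation does not release the GIL), you can also split each node into a few smaller workers. Only an illustration, tune the split to your workload:

from dask_jobqueue import SLURMCluster

# Still one job per node, but 4 worker processes per job, each getting
# 32 / 4 = 8 threads and a quarter of the job's memory.
cluster = SLURMCluster(memory='256g', processes=4, cores=32, queue='photon')
cluster.scale(jobs=64)   # 64 nodes -> 64 * 4 = 256 workers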


Thank you both, @guillaumeeb and @mrocklin. I understand that my initial plan was not realistic.

But I still could not run on 64 nodes.

With this setup I had N job scripts, where N is the number of nodes.

I further added job arrays, so I now run on 32 nodes with a single job script (again using SLURMCluster). I am waiting for 64 free nodes to check whether the problem comes from the number of jobs or from something else (potentially internal).

I will report once I have results.