Prevent Dask from processing more than one future per worker on an LSF cluster

I am using Dask on an LSF cluster to process some images in parallel. The processing function itself uses joblib to perform multiple computations on each image in parallel.

It seems that setting the n_workers and cores parameters generally results in n_workers * cores futures running at the same time. I would like only n_workers futures to be processed at a time, each of them having cores cores at its disposal for use with joblib.

How can I achieve this?

Hi @damiankucharski,

In order to have only n_workers processes running at the same time, I see two solutions:

However, I’m not sure using joblib from inside Dask workers will work well. I would recommend using only Dask, or using joblib on top of Dask as this is done by Scikit-learn. But I understand that you want each image to be processed on a single worker, which may not be that simple with what I suggest.

Hello @guillaumeeb, thank you for your answer. Could you please provide an example for the first point?
Also, I am not sure what you mean by “using joblib on top of Dask as this is done by Scikit-learn”. Doesn't sklearn simply use joblib in the most straightforward way to run computations on multiple cores?

OK, so I’m used to PBSCluster. For LSFCluster, you want to use ncpus instead of resources_spec. You should do something like:

from dask_jobqueue import LSFCluster
cluster = LSFCluster(cores=1, memory='32GiB', ncpus=8)

This would give you a worker with one process and one thread, but in a job that has booked 8 CPUs.
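
To make this more concrete, here is a minimal sketch of how such a cluster could be driven so that each worker handles one image at a time while joblib fans out inside the task. The names filter_image, process_image and run_on_lsf, the job count and the value of N_CPUS are all hypothetical and only there for illustration. It also assumes that joblib is able to start its own workers from inside the Dask worker process, which this thread does not confirm.

```python
from joblib import Parallel, delayed

N_CPUS = 8  # assumption: matches the ncpus value passed to LSFCluster


def filter_image(image, k):
    # Hypothetical stand-in for one computation on an image.
    return [pixel * k for pixel in image]


def process_image(image, n_jobs=N_CPUS):
    # Meant to run as a single Dask task; joblib spreads the
    # per-image computations over the CPUs booked for the job.
    return Parallel(n_jobs=n_jobs)(
        delayed(filter_image)(image, k) for k in range(4)
    )


def run_on_lsf(images, n_jobs_lsf=4):
    # Imported here so the joblib part above can be tried without a cluster.
    from dask.distributed import Client
    from dask_jobqueue import LSFCluster

    # One process and one thread per worker, in an LSF job booking N_CPUS CPUs.
    cluster = LSFCluster(cores=1, memory="32GiB", ncpus=N_CPUS)
    cluster.scale(jobs=n_jobs_lsf)  # hypothetical number of LSF jobs
    client = Client(cluster)

    # With a single thread per worker, each worker runs one task at a time.
    futures = client.map(process_image, images)
    return client.gather(futures)
```

Calling run_on_lsf(list_of_images) would then keep at most n_jobs_lsf images in flight, one per worker.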

See dask_jobqueue.LSFCluster — Dask-jobqueue 0.7.4+11.g96e39da.dirty documentation for more options.

For the second point, Dask can be used as a backend of joblib, so joblib sends tasks to a Dask cluster instead of just doing multiprocessing. See a simple example here: Using dask distributed for single-machine parallel computing — joblib 1.2.0.dev0 documentation.
My sentence about Scikit-learn was a bit wrong: Scikit-learn uses joblib, and there are lots of examples of how to use the Dask joblib backend with Scikit-learn.
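
As a rough illustration of the Dask backend for joblib, along the lines of the joblib documentation example, something like the following should work. The functions square and squares_via_dask are hypothetical, and the local in-process Client is only an assumption so the sketch can be tried on one machine; on a real cluster you would create the Client from your LSFCluster instead.

```python
import joblib
from dask.distributed import Client


def square(x):
    # Hypothetical stand-in for a real computation.
    return x * x


def squares_via_dask(n=10):
    # A local, thread-based client for illustration only.
    with Client(processes=False):
        # Inside this block joblib sends its tasks to the Dask cluster
        # instead of starting its own local processes.
        with joblib.parallel_backend("dask"):
            return joblib.Parallel()(
                joblib.delayed(square)(i) for i in range(n)
            )
```

The same parallel_backend("dask") context can wrap Scikit-learn calls that accept n_jobs, since they go through joblib.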

Oh, I did not realize that you can use both the cores and ncpus arguments. I thought they were basically the same parameter. Thank you very much for all your help.