Submit an MPI job to a single Dask worker spanning multiple nodes

Hi, I have a quick question: is it possible to span a single Dask worker across multiple nodes and embed MPI for communication between those nodes? I need to handle a problem with many independent tasks, and each task needs the memory of more than two machines, with data communication in between. I am hoping to still use MPI for the data communication, and to use Dask workers to handle the independent tasks (SGECluster with adaptive(); I really like the adaptive/fault-tolerance feature, compared to the dask-mpi approach where slots are fixed). But I don’t know how to submit an MPI job to a single Dask worker through submit(). Is it doable, and do you have an example? An alternative solution would also be appreciated. Thanks in advance.
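
To make it concrete, here is a rough sketch of what I have in mind. The mpirun invocation, the core count, and ./my_solver are just placeholders, and the body of run_mpi_task is exactly the part I don’t know how to make span multiple nodes:

```python
from dask_jobqueue import SGECluster
from dask.distributed import Client
import subprocess

# Adaptive SGE-backed cluster; this is the fault-tolerance behaviour
# I want to keep (queue/resource options omitted for brevity).
cluster = SGECluster(cores=1, memory="4GB")
cluster.adapt(minimum=0, maximum=10)
client = Client(cluster)

def run_mpi_task(task_id):
    # Placeholder: each independent task should launch an MPI job that
    # spans several nodes, but the worker only owns one slot on one node.
    result = subprocess.run(
        ["mpirun", "-np", "16", "./my_solver", str(task_id)],
        capture_output=True, check=True,
    )
    return result.stdout

futures = client.map(run_mpi_task, range(100))
results = client.gather(futures)
```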

Hi @llodds,

I’m not sure I understand your question. A Dask Worker is a single process running on a single node: it cannot use memory from several nodes. Unless you want to submit MPI applications from a Dask Worker?

Couldn’t you just use job arrays if your tasks are independent?

@guillaumeeb

Yes, I want to submit an MPI application from a single Dask worker. The MPI application needs to span multiple nodes due to its memory requirements. Results from those tasks are then integrated into a more complicated scientific workflow (iterations of gradient updates, where each gradient update combines the results of multiple independent tasks), which I hope to build in Python rather than in bash with job arrays.
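
Roughly this kind of loop, continuing the sketch from my first post; combine_gradients, apply_update, and the iteration counts are hypothetical placeholders:

```python
# Rough shape of the gradient workflow I want in Python instead of
# bash + job arrays. Reuses client and run_mpi_task from the sketch above.
def combine_gradients(partials):
    ...  # placeholder: reduce per-task results into a single gradient

def apply_update(model, gradient):
    ...  # placeholder: one gradient-descent step

model = None  # initial model state
for iteration in range(50):
    futures = client.map(run_mpi_task, range(100))  # independent MPI tasks
    gradient = combine_gradients(client.gather(futures))
    model = apply_update(model, gradient)
```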

From what I understand, this sounds a bit complicated. You want to launch a Dask cluster using dask-jobqueue, submitting jobs to start workers, and then submit tasks to this cluster which will in turn start MPI jobs through SGE?

This might be doable, but I’m not sure what the Dask cluster buys you here, and it will involve a lot of process spawning to start the jobs.

Couldn’t you just use a normal Python interpreter without Dask to submit the MPI jobs and retrieve the results?
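
Something like this minimal sketch, assuming an SGE parallel environment named mpi and a run_task.sh job script (both site-specific placeholders):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def submit_mpi_job(task_id):
    # "qsub -sync y" blocks until the SGE job finishes, so each thread
    # simply waits for its multi-node MPI job to complete.
    subprocess.run(
        ["qsub", "-sync", "y", "-pe", "mpi", "16", "run_task.sh", str(task_id)],
        check=True,
    )
    return task_id  # placeholder: load whatever result file the job wrote

# Keep up to 10 MPI jobs in flight at once, no Dask involved.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(submit_mpi_job, range(100)))
```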

Maybe with a code sample of your actual workflow this would be clearer.