Portable Workflows: Specifying the cluster class via config

In production I imagine most users will be using dask_jobqueue or dask_cloudprovider rather than the standard LocalCluster. However, while both of these libraries can be configured through the usual dask.config mechanism, you can't actually choose the cluster type via the config; you have to edit your code to instantiate a SLURMCluster() or FargateCluster() directly, which instantly makes the workflow non-portable. By that I mean I want the exact same codebase to be runnable by an HPC user and a cloud user, without them or me having to edit the actual Python code. If you could specify the cluster type in the dask config this would be a non-issue, but that doesn't seem to be possible.
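Ideally I'd like to be able to do something like the sketch below, where the cluster class comes from config rather than from code. Note that the `cluster.class` / `cluster.kwargs` keys here are made up, user-defined keys that I would have to add to my own config files, not something Dask currently understands:

```python
import importlib

import dask.config
from dask.distributed import Client

# Hypothetical, user-defined config keys ("cluster.class" / "cluster.kwargs"
# are not built-in Dask keys); they would live in e.g. ~/.config/dask/cluster.yaml.
cluster_path = dask.config.get("cluster.class", "distributed.LocalCluster")
cluster_kwargs = dask.config.get("cluster.kwargs", {})

# Import the configured class dynamically, e.g. "dask_jobqueue.SLURMCluster"
# for an HPC user or "dask_cloudprovider.aws.FargateCluster" for a cloud user.
module_name, class_name = cluster_path.rsplit(".", 1)
cluster_cls = getattr(importlib.import_module(module_name), class_name)

cluster = cluster_cls(**cluster_kwargs)
client = Client(cluster)
```

That works as a manual workaround, but it means every project has to reinvent this boilerplate, which is why I'm asking whether something built-in exists.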

What is the best way to keep my workflows portable here? Is there a mechanism for choosing the cluster type via the dask config?

This is something we are trying to address with dask-ctl.
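The rough idea is that the cluster is described in a small spec file and the code stays the same, so swapping HPC for cloud is just a matter of swapping the spec. A sketch of what that looks like is below; the spec contents and the `create_cluster` helper are illustrative, so please check the dask-ctl docs for the exact file format and API:

```python
from dask.distributed import Client
from dask_ctl.lifecycle import create_cluster  # illustrative import path

# cluster.yaml might contain something along the lines of:
#   version: 1
#   module: "dask_jobqueue"
#   class: "SLURMCluster"
#   kwargs:
#     cores: 8
#     memory: "16GB"
# A cloud user would point the same code at a spec referencing
# dask_cloudprovider / FargateCluster instead, without touching the Python.
cluster = create_cluster("cluster.yaml")
client = Client(cluster)
```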
