In production I imagine most users will be using dask_jobqueue or dask_cloudprovider rather than the standard LocalCluster. However, while both of these libraries can be configured using the usual dask.config mechanism, you can't actually choose the cluster type through the config: you have to edit your code to instantiate a SLURMCluster() or FargateCluster(), which instantly makes the workflow non-portable. By this I mean that I want the exact same codebase to be runnable by an HPC user and a cloud user, without them or me having to edit the actual Python code. If you could specify the cluster type in the dask config this would be a non-issue, but that doesn't seem to be possible.
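Concretely, this is what my script has to look like today (a minimal sketch; I'm assuming the resource options such as cores and memory are picked up from the jobqueue section of dask.config, so only the class choice is the problem):

```python
from dask.distributed import Client
from dask_jobqueue import SLURMCluster   # HPC users need this import...
# from dask_cloudprovider.aws import FargateCluster   # ...cloud users need this one instead

# Resource options (cores, memory, queue, ...) come from dask.config,
# but the choice of cluster class itself is hard-coded in the script.
cluster = SLURMCluster()  # or FargateCluster() -- a code change either way
client = Client(cluster)
```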
What is the best solution here to allow my workflows to retain portability? Is there a mechanism for using the dask config to choose the cluster?
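For concreteness, something along these lines is what I have in mind. The "cluster.class" config key here is made up by me, not an existing Dask option, and the dynamic import is just to illustrate the idea of selecting the class from config rather than from code:

```python
import importlib

import dask.config
from dask.distributed import Client

# Hypothetical "cluster.class" key, e.g. in ~/.config/dask/cluster.yaml:
#
#   cluster:
#     class: dask_jobqueue.SLURMCluster                # HPC user
#     # class: dask_cloudprovider.aws.FargateCluster   # cloud user
#
module_name, class_name = dask.config.get("cluster.class").rsplit(".", 1)
cluster_cls = getattr(importlib.import_module(module_name), class_name)

cluster = cluster_cls()   # resource settings still come from dask.config
client = Client(cluster)
```

If something like this already exists (or there's a better-supported way to achieve the same portability), that's exactly what I'm looking for.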