I’m trying to understand the correct way to set up a Dask cluster across remote machines and connect multiple Dask clients to it, where each client brings its own set of Python modules.
For example, I have a project structured like this:
```
dask-app/
├── poetry.lock
├── pyproject.toml
├── src/
│   └── dask_app/
│       ├── __init__.py
│       ├── main.py               # entry point
│       ├── my_module/
│       │   ├── __init__.py
│       │   ├── config.py
│       │   ├── dask_connector.py
│       │   └── operations.py     # computation logic
│       └── utils/
│           ├── __init__.py
│           ├── logger.py
│           └── data_utils.py
```
and main.py is as follows:
```python
from dask_app.my_module.config import load_config
from dask_app.my_module.dask_connector import get_dask_client
from dask_app.my_module.operations import process_data


def main():
    config = load_config()

    # Connect to cluster
    with get_dask_client(config) as client:
        print(f"Connected to cluster: {client.cluster}")

        # Perform computations
        result = process_data(client, config)
        print(f"Computation result: {result}")


if __name__ == "__main__":
    main()
```
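For context, the connector and the computation logic don’t do anything exotic. Simplified sketches of `dask_connector.py` and `operations.py` (the `scheduler_address` and `n_items` config keys are just placeholders for what my real config contains):

```python
# dask_connector.py (simplified)
from contextlib import contextmanager

from dask.distributed import Client


@contextmanager
def get_dask_client(config):
    """Connect to the remote scheduler whose address comes from the config."""
    client = Client(config["scheduler_address"])  # e.g. "tcp://scheduler-host:8786"
    try:
        yield client
    finally:
        client.close()


# operations.py (simplified) -- the functions submitted here are defined inside
# my own package, which is why the workers need dask_app to be importable
def square(x):
    return x * x


def process_data(client, config):
    futures = client.map(square, range(config.get("n_items", 100)))
    return sum(client.gather(futures))
```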
What’s the recommended way to deploy such an app to the cluster without restarting the entire Dask cluster every time I update local code?
I’ve come across `client.upload_file()`, but I haven’t found a complete example showing how to use it effectively for a project like this, especially when multiple clients with different logic run against the same cluster simultaneously.
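Something along these lines is what I have in mind (`deploy_and_run.py` and the config keys are hypothetical, and I haven’t verified that this is how `upload_file` is meant to be used for a whole package):

```python
# deploy_and_run.py -- rough idea, not a working deployment script
import shutil

from dask.distributed import Client

from dask_app.my_module.config import load_config
from dask_app.my_module.operations import process_data


def main():
    config = load_config()
    client = Client(config["scheduler_address"])

    # Zip the package under src/ and ship it to the workers;
    # upload_file accepts a single .py, .egg or .zip file.
    archive = shutil.make_archive("dask_app", "zip", root_dir="src", base_dir="dask_app")
    client.upload_file(archive)

    result = process_data(client, config)
    print(f"Computation result: {result}")
    client.close()


if __name__ == "__main__":
    main()
```

In particular, I don’t know whether an archive uploaded this way is visible only to the client that uploaded it, or whether the last upload simply wins for every worker, which matters when several clients with different code share the same cluster.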