@martindurant (or anyone else who can help): can I ask a follow-up to your answer in "Question about how to best store self-written functions, when using dask_gateway" over on Pangeo Cloud Support?
When using `cluster.adapt()`, my question is:
- How can I have each spawned worker automatically import the local Python file before starting a task?
- Would it be as simple as running the following two lines before calling `cluster.adapt()`?

```python
from distributed.diagnostics.plugin import UploadFile
client.register_plugin(UploadFile('/home/jovyan/path_to_some_pythonfile.py'))
```
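For context, here is the full order of operations I have in mind, as a rough, untested sketch (the cluster setup is just our usual dask_gateway defaults):

```python
from dask_gateway import Gateway
from distributed.diagnostics.plugin import UploadFile

gateway = Gateway()
cluster = gateway.new_cluster()
client = cluster.get_client()

# Register the plugin before enabling adaptive scaling, so that the file
# should (as I understand it) also reach workers that join later.
client.register_plugin(UploadFile('/home/jovyan/path_to_some_pythonfile.py'))

cluster.adapt(minimum=4, maximum=100)
```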
Our workflow when using `cluster.scale(100)` is to wait for all workers to arrive, and then upload the .py files directly to every worker with `client.upload_file('/home/jovyan/path_to_some_pythonfile.py', load=True)`. Now that we are testing out autoscaling with `cluster.adapt(minimum=4, maximum=100)`, I don't think the `upload_file()` method would suffice, as workers “appear” and “disappear” throughout the runtime.
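In code, our current fixed-size workflow looks roughly like this (`wait_for_workers` is how I would express the waiting step; in practice we have sometimes just watched the dashboard):

```python
cluster.scale(100)
client.wait_for_workers(100)  # block until all 100 workers have arrived
client.upload_file('/home/jovyan/path_to_some_pythonfile.py', load=True)
```

My worry with adaptive mode is that a worker spawned after `upload_file()` has run might never receive the file.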
I have read up on using a Built-In Worker Plugin which, if I understand correctly, will tell each worker to run a defined setup step before starting any jobs (for example, when the worker spawns?).
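To check my understanding, here is roughly what I picture a custom Worker Plugin doing. `ShipSourceFile` is a name I made up, and this is an untested sketch of the pattern as I understand it, not something from the docs:

```python
import os
from distributed.diagnostics.plugin import WorkerPlugin

class ShipSourceFile(WorkerPlugin):
    """Carry a .py file's bytes from the client and write them out
    on each worker, including workers spawned later by adapt()."""

    def __init__(self, path):
        # Runs on the client, where the file exists; the bytes travel
        # with the (pickled) plugin to every worker.
        self.filename = os.path.basename(path)
        with open(path, 'rb') as f:
            self.data = f.read()

    def setup(self, worker):
        # setup() runs once per worker before it starts executing tasks,
        # including on workers that join the cluster after registration.
        target = os.path.join(worker.local_directory, self.filename)
        with open(target, 'wb') as f:
            f.write(self.data)

client.register_plugin(ShipSourceFile('/home/jovyan/path_to_some_pythonfile.py'))
```

Is that the right mental model, and is the built-in `UploadFile` plugin essentially doing this for me?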