How to use Built-In WorkerPlugin to import code when worker spawns

@martindurant or someone else:
Can I ask a follow-up to your answer in "Question about how to best store self-written functions, when using dask_gateway" (Pangeo Cloud Support - Pangeo)?

When using cluster.adapt(), my questions are:

  • How can I have each spawned worker automatically import the local Python file before starting a task?
  • Would it be as simple as running the following two lines before calling cluster.adapt()?
from distributed.diagnostics.plugin import UploadFile
client.register_plugin(UploadFile('/home/jovyan/path_to_some_pythonfile.py'))

Our workflow when using cluster.scale(100) is to wait for all workers to arrive and then import the .py files directly on every worker with client.upload_file('/home/jovyan/path_to_some_pythonfile.py', load=True). Now that we are testing autoscaling with cluster.adapt(min=4, max=100), I don't think the upload_file() method would suffice, as workers "appear" and "disappear" throughout the runtime.
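For reference, the fixed-size workflow looks roughly like this (a minimal sketch; the dask_gateway cluster creation is assumed):

from dask_gateway import Gateway

gateway = Gateway()
cluster = gateway.new_cluster()   # assumed: a dask_gateway cluster
client = cluster.get_client()

cluster.scale(100)
client.wait_for_workers(100)      # block until all 100 workers have arrived

# push the local module to every worker that exists right now
client.upload_file('/home/jovyan/path_to_some_pythonfile.py', load=True)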

I have read up on using a built-in Worker Plugin which, if I understand correctly, tells each worker to run a defined task before starting any jobs (for example, when it spawns?).
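If I understand it right, a custom plugin along these lines would do the same thing (just a sketch; the class name is illustrative, and the imported module would still need to exist on the worker, e.g. via UploadFile):

from distributed.diagnostics.plugin import WorkerPlugin

class ImportOnStart(WorkerPlugin):
    # illustrative plugin: setup() runs once on each worker when the plugin
    # is attached, including workers that spawn later under adaptive scaling
    def setup(self, worker):
        import path_to_some_pythonfile  # hypothetical module; must already be on the worker's path

client.register_plugin(ImportOnStart())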

Hi @ofk123,

Well, it should, and if it does not work, then this is probably a bug!

You shouldn’t have to wait for all workers to arrive, and should be able to register the plugin from the beginning!

See the doc (that you mentioned):

setup(worker: Worker) → None | Awaitable[None]

Run when the plugin is attached to a worker. This happens when the plugin is registered and attached to existing workers, or when a worker is created after the plugin has been registered.

The UploadFile plugin uses this setup() function.
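So something like this should be all you need (a minimal sketch; it assumes cluster and client already exist, and uses the minimum/maximum keywords that adapt() accepts):

from distributed.diagnostics.plugin import UploadFile

# Register the plugin once, up front: every worker that attaches later will
# run the plugin's setup() and receive the file automatically.
client.register_plugin(UploadFile('/home/jovyan/path_to_some_pythonfile.py'))

cluster.adapt(minimum=4, maximum=100)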