Package upload - best practice

I am developing a project made up of a large package tree. It works as expected both when I run it locally and when I submit it to dask-scheduler to distribute across my cluster.
I have a script that “compiles” the Python code into an egg file (remove the old egg && python setup.py bdist_egg) every time I run, so that the most recent code is built and uploaded to the workers through the client.

At the beginning of the code, I run:

    from dask.distributed import Client
    from distributed.diagnostics.plugin import UploadFile

    client = Client(address=config["dask_server"], name="project-client")
    filename = config["dist_file"]  # path to the freshly built egg
    client.register_worker_plugin(UploadFile(filename), name="egg-package")
    # registering twice while I can't figure out a better way
    client.register_worker_plugin(UploadFile(filename), name="egg-package")

The workers are already running on the cluster through the dask-worker command.

This process works, but it is annoying.
If the egg file changes, the worker reports “bad file header,” and I have to kill the workers, delete the temp directory, and start the workers again. After that, the same egg file it complained about works.

Is there a better way to handle this process? If the package were finished, I could simply install it on my worker nodes, but the egg changes frequently.

I tried to find a way to clean the worker temp directory when the client connects, but could not find anything in the forums or the documentation. Also, if I retire the workers, the dask-worker process dies on the node machines, and I need to connect to the nodes manually to start them again.

I feel I am missing something silly and getting bogged down.

Hi! Sorry for the delay in replying. This is a somewhat complicated situation, so what I’m suggesting may not work for you. That said, the problem is that Python needs to re-install the egg when it’s updated. My solution has 2 parts:

  1. Run pip install -e (an editable install) on the workers
  2. Update the code in the path where pip installed from.

The first step shouldn’t be too hard, since you’re already doing it. The second step is a little trickier. One option is to git push to a (possibly private) repo and then either set up a webhook or have the workers poll for updates and git pull the code.

I don’t know if this is a good idea or not, but it should resolve the egg errors you’re getting.
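
The polling option could be sketched roughly as below. This is only an illustration, not a tested setup: the function names, the `main` branch, the `origin` remote, and the 60-second interval are all assumptions, and it presumes each worker node has a git checkout at the path that pip install -e points to.

```python
import subprocess
import time


def git(repo_dir, *args, run=subprocess.run):
    """Run a git command inside repo_dir and return its stripped stdout."""
    result = run(
        ["git", "-C", repo_dir, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def pull_if_changed(repo_dir, branch="main", run=subprocess.run):
    """Fetch origin and pull when HEAD differs from the remote branch.

    Returns True when a pull was attempted, False when already up to date.
    The branch name and the "origin" remote are assumptions.
    """
    git(repo_dir, "fetch", "origin", branch, run=run)
    local = git(repo_dir, "rev-parse", "HEAD", run=run)
    remote = git(repo_dir, "rev-parse", f"origin/{branch}", run=run)
    if local == remote:
        return False
    # --ff-only refuses to create merge commits on the worker checkout.
    git(repo_dir, "pull", "--ff-only", "origin", branch, run=run)
    return True


def poll_forever(repo_dir, interval_seconds=60):
    """Check for updates every interval_seconds (hypothetical default)."""
    while True:
        pull_if_changed(repo_dir)
        time.sleep(interval_seconds)
```

Note that a pull only changes the files on disk. Modules already imported in a running worker process are not reloaded, so the workers would probably still need a restart (for example with `client.restart()`) before they see the new code.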

I could rsync the code to all workers to ensure they have the same code at run time. I will give it a try later in the week.
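
A minimal sketch of that rsync step, run from the client machine before submitting work. The host names, paths, and function names here are hypothetical, and it assumes passwordless SSH access to each worker node and that the destination is the directory the editable install points to.

```python
import subprocess


def build_rsync_command(source_dir, host, dest_dir):
    """Build an rsync command that mirrors source_dir onto host:dest_dir."""
    # A trailing slash on the source copies the directory contents
    # instead of creating a nested directory on the destination.
    source = source_dir.rstrip("/") + "/"
    return [
        "rsync",
        "-az",        # archive mode, compress during transfer
        "--delete",   # drop files on the worker that no longer exist locally
        "--exclude", "__pycache__",
        source,
        f"{host}:{dest_dir}",
    ]


def sync_to_workers(source_dir, hosts, dest_dir, run=subprocess.run):
    """Push the local source tree to every host, stopping on the first failure."""
    for host in hosts:
        run(build_rsync_command(source_dir, host, dest_dir), check=True)


if __name__ == "__main__":
    # Hypothetical host names and paths, for illustration only.
    sync_to_workers("/home/me/project", ["worker1", "worker2"], "/opt/project")
```

As with the git approach, worker processes that have already imported the package keep the old modules in memory, so a worker restart after the sync is likely still needed.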