PYTHONPATH setup

Hi,
I use NFS to keep some custom code so that each node can access it. To be flexible, I would like to submit a batch of jobs with an on-the-fly PYTHONPATH, so that different users can point to different code bases on NFS, e.g., /data/nfs/james/project1 and /data/nfs/mike/project1. Is there a way to pass environment variables for each batch of jobs on the fly?

-Jackie

Hi Jackie, thanks for the question! You can use Nanny Plugins to set environment variables for worker processes. There is also the UploadFile Worker Plugin, which allows you to upload a local file to the workers. Both can also be set via the Dask config to create default settings for different users. Hope that helps!
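For example, something along these lines (a minimal sketch; the scheduler address, NFS path, and archive name are placeholders for your own setup):

```python
from dask.distributed import Client
from distributed.diagnostics.plugin import Environ, UploadFile

client = Client("tcp://scheduler:8786")  # placeholder scheduler address

# Nanny plugin: set PYTHONPATH in the environment of each worker process.
# The Environ plugin restarts the workers so the new environment takes effect.
client.register_plugin(Environ({"PYTHONPATH": "/data/nfs/james/project1"}))

# Worker plugin: ship a local file (e.g. a zipped package) to every worker,
# including workers that join the cluster later.
client.register_plugin(UploadFile("project1.zip"))
```

On older versions of distributed, the equivalent calls are client.register_worker_plugin(Environ(...), nanny=True) and client.register_worker_plugin(UploadFile(...)).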

Hi Scharlottej,
It is interesting to revisit Dask a year later. The UploadFile Worker Plugin approach seems more reasonable than sharing code through NFS. In short, I would like to submit jobs to Dask with their dependencies (a project) on the fly. In other words, the dependencies should only be visible to the jobs they were submitted with. I tried the client.upload_file approach, which programmatically builds the local project into a wheel and uploads it before submitting a job. However, I noticed the error “zipimport.ZipImportError: bad local file header: ‘/tmp/dask-worker-space/worker-ic9bri0d/deepsea_core-2.0.5-py3.9.egg’” when the file is re-uploaded. It looks like the file stays on the Dask worker permanently, which defeats the purpose of submitting jobs with their dependencies on the fly. Please correct me if I am using the UploadFile feature incorrectly.
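For reference, the build-and-upload step looks roughly like this (a simplified sketch; the project path is a placeholder, and I show a setuptools egg build here because that matches the filename in the error, a wheel build is analogous):

```python
import glob
import os
import subprocess

from dask.distributed import Client

client = Client("tcp://scheduler:8786")  # placeholder scheduler address

project_dir = "/path/to/deepsea_core"  # local checkout of the project

# Package the project into an archive under dist/.
subprocess.run(["python", "setup.py", "bdist_egg"], cwd=project_dir, check=True)

# Pick the freshly built archive and upload it to all workers.
archive = sorted(glob.glob(os.path.join(project_dir, "dist", "*.egg")))[-1]
client.upload_file(archive)  # re-running this after a rebuild triggers the error above
```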