I’ve been using notebooks with Dask clusters, and I want to clean up my code by splitting it into modules (not to publish libraries, but to pick my code apart into cleaner sections so I can rebuild it from the modules and produce a small version meant for export).
I can import my modules locally, but when I try to run a Dask job that relies on a function from an imported module, the task breaks on the workers.
My modules live in a subfolder called `imports` (a minimal reproduction of the failure follows the layout):

```
notebook.ipynb
imports\
    functions.py
    imports.py
    constants.py
```
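Here is roughly what the failing code looks like. The function name `process_partition` and the scheduler address are placeholders; the real code just imports a function from `imports.functions` and submits it:

```python
from dask.distributed import Client

from imports.functions import process_partition  # placeholder name; imports fine locally

client = Client("tcp://scheduler:8786")  # address is illustrative

process_partition(42)  # calling it locally in the notebook works

# Submitting the imported function to the cluster is what breaks:
future = client.submit(process_partition, 42)
future.result()  # raises ModuleNotFoundError: No module named 'imports' on the worker
```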
I’ve read a few related posts and tried several things: setting PYTHONPATH, running everything from the same location, and creating a symbolic link from the root folder to the `imports` subfolder. A rough sketch of those attempts is below.
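Roughly what those attempts looked like (the paths are illustrative, and `add_path` is just a throwaway helper):

```python
import sys
from dask.distributed import Client

client = Client("tcp://scheduler:8786")  # same illustrative address as above

# Locally, in the notebook (plus PYTHONPATH=/path/to/project exported in the shell):
sys.path.insert(0, "/path/to/project")

# On the workers, via Client.run, which executes a function once on every worker:
def add_path():
    import sys
    sys.path.insert(0, "/path/to/project")

client.run(add_path)
# Tasks that import from the `imports` package still fail the same way afterwards.
```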
What is the best practice for loading custom modules onto a Dask cluster? Is it to install them as a proper package (i.e., into the site-packages folder of the venv)?
It seems like it should be a simple fix. If I hand-code the function in the notebook it works, but if I put it in a file and import it, it fails on Dask (while still working locally). I know it’s a namespace issue, but even when I put the same `imports` folder where it should be on the remote workers, Dask still throws an error about a missing module.
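For contrast, this is what I mean by hand-coding: the same function defined directly in the notebook (so it lives in `__main__`, which, as I understand it, cloudpickle serializes by value rather than by module reference) runs on the cluster without complaint:

```python
# Same body as in imports/functions.py, just defined inline (body is a placeholder):
def process_partition(x):
    return x * 2

future = client.submit(process_partition, 42)
future.result()  # works: returns 84, no ModuleNotFoundError
```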