I am extremely new to dask (I just finished writing my first script that uses it), so my apologies in advance.
I have a collection of python scripts running on a large university network that generate images from meteorological data and then transfer those images to a public website. These scripts are only loosely parallelized: multiple systemd services each run a separate python process. Nothing monitors how many threads are active, how much memory is in use, etc., and the error logging isn’t great either. I’d like to switch to using a dask scheduler to manage everything, where the scripts just submit tasks to the workers, and then I can use the dashboard to look through exceptions to detect and fix bugs.
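For context, here is a rough sketch of the submit-and-collect pattern I have in mind. It is not my real code: `render_image` and `submit_all` are made-up names, and I'm using the standard library's `ThreadPoolExecutor` as a stand-in so the snippet runs without a cluster. My understanding is that a `dask.distributed.Client` could be passed in its place, since it also has `submit()` and its futures also have `exception()` and `result()`, but I haven't verified that this is the recommended way to do it.

```python
from concurrent.futures import ThreadPoolExecutor


def render_image(station_id):
    """Hypothetical stand-in for one of my image-generation jobs."""
    if station_id < 0:
        raise ValueError(f"bad station id: {station_id}")
    return f"station_{station_id}.png"


def submit_all(executor, func, items):
    """Submit one task per item and return (results, errors), keyed by item."""
    futures = {item: executor.submit(func, item) for item in items}
    results, errors = {}, {}
    for item, future in futures.items():
        exc = future.exception()  # blocks until this task has finished
        if exc is None:
            results[item] = future.result()
        else:
            errors[item] = exc  # keep the exception instead of crashing the script
    return results, errors


if __name__ == "__main__":
    # Assumption: with dask, `pool` would instead be a distributed Client
    # connected to the scheduler; the rest should stay the same.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results, errors = submit_all(pool, render_image, [1, 2, -1])
    print("finished:", results)
    print("failed:", errors)
```

The idea is that one failing task doesn't take down the whole script, and the failures are what I'd hope to browse in the dashboard.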
I can fairly easily configure my backend server to allow connections to port 8787 only from on-network traffic. This will not be on the public internet, so I’m not too worried about people trying to exploit potential vulnerabilities in dask’s dashboard, and I think it would be cool to have a page that shows the real-time status of the python scripts’ progress (and the tracebacks! motivated students could probably resolve a lot of the errors in my code…). However, I have a couple of concerns:
- Is the dask status dashboard only a status monitoring page, or are there controls buried in there somewhere that can send commands to the cluster/scheduler/workers?
- Is it possible to password-protect the dask dashboard? Especially if the dashboard does expose controls, I want to allow some access to it, but not unlimited access.
Thanks!