Hi, new to Dask here. I’m working on loading a very large dataset on a Dask cluster created with dask-cloudprovider. I’ve been able to set everything up so that it works, but when I try to compute anything on my data I run into issues with the environments not matching. I’ve been able to use plugin.PipInstall to install most of the relevant packages, but one of my packages requires me to install drivers on all workers. This would be easy with Docker, but my company uses Azure and does not use Docker. Are there any ways to coordinate my environments without Docker that will let me install drivers on my workers?
Hi @a_sad_elm, welcome to the Dask community!
When you talk about drivers, do you mean system-level drivers?
How do you build your cluster with dask-cloudprovider? Why are the environments not matching?
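To help answer the mismatch question, here is a minimal sketch of one way to compare Python package versions between your client and the workers. The helper names `collect_versions` and `diff_versions` are hypothetical and not part of Dask, and the package list in the usage comment is only an example. It covers Python packages only, so it will not detect missing system drivers.

```python
from importlib import metadata


def collect_versions(packages):
    """Return {package: installed version or None} for the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions


def diff_versions(local, per_worker):
    """Return {worker: {package: (local_version, worker_version)}}, mismatches only."""
    mismatches = {}
    for worker, remote in per_worker.items():
        bad = {
            name: (local_version, remote.get(name))
            for name, local_version in local.items()
            if remote.get(name) != local_version
        }
        if bad:
            mismatches[worker] = bad
    return mismatches


# Usage against a live cluster (assumes `client` is an existing distributed.Client):
#   packages = ["dask", "distributed", "pandas"]  # example list, replace with your own
#   local = collect_versions(packages)
#   per_worker = client.run(collect_versions, packages)  # {worker_address: {...}}
#   print(diff_versions(local, per_worker))
```

`client.run` executes the function on every worker and returns a dict keyed by worker address, so the result can be passed straight to `diff_versions`. For the core Dask dependencies, `client.get_versions(check=True)` performs a similar built-in check.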
If you don’t want to use Docker, then you’ll probably have to build a template (a custom image) for your virtual machines that already has the drivers installed.