Hello there! I’m relatively new to Dask, and I’m trying to figure out if it’s the right tool to use for my use case. Hopefully this is a good place to ask my question.
Most of the examples I see on the website involve processing very large data sets in parallel, usually by executing many small (potentially interdependent) operations. For my current project, I'm running parameter studies made up of many simulations (3-90 minutes each) that are not memory intensive and can run independently. I'm using Dask to minimize total runtime by fully utilizing system resources, rather than running the simulations sequentially, in a simple way. I understand the above could also be done with the multiprocessing library, but submitting tasks to the client and monitoring progress from the dashboard is just so easy, since Dask handles the scheduling and reporting for me. Is this a common way to use Dask?
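For reference, here is a rough sketch of the pattern I have in mind (the simulation function, worker counts, and parameter values below are just placeholders, not my actual code):

```python
import time
from dask.distributed import Client, LocalCluster

def run_simulation(params):
    # stand-in for one independent, long-running simulation (3-90 minutes in my case)
    time.sleep(1)
    return params["alpha"] ** 2

if __name__ == "__main__":
    # one single-threaded worker process per simulation slot
    cluster = LocalCluster(n_workers=4, threads_per_worker=1)
    client = Client(cluster)
    print(client.dashboard_link)  # open this URL to watch progress

    parameter_sets = [{"alpha": a} for a in range(20)]      # hypothetical parameter sweep
    futures = client.map(run_simulation, parameter_sets)    # submit all runs at once
    results = client.gather(futures)                        # block until everything finishes
    print(results)
```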