Hi @nelsongriff and welcome to Discourse!
Since you’re running this locally, one improvement could be to use the threaded scheduler, which avoids the cost of transferring data between tasks that you’d incur with the multiprocessing or distributed schedulers. Here’s an example:
```python
import dask

data = {'key_1': {'data': {'blah': 'value'}}, 'key_2': {'more_data': {'blah_blah': 'values'}}}
delayed_data = dask.delayed(data)

parameters = [{"a": 1}, {"a": 2}]

def run_simulation(data, param):
    return id(data)

simulations = []
for params in parameters:
    simulations.append(dask.delayed(run_simulation)(delayed_data, params))

# all ids in results will be the same, since threads share memory;
# with scheduler="processes" each task would see its own copy of the data
results = dask.compute(*simulations, scheduler="threads")
```
The downside to the threaded scheduler is that, depending on the work being done in `run_simulation`, you could run into issues with Python’s Global Interpreter Lock (GIL), which means you won’t be taking advantage of any parallelism. Can you elaborate a bit on what happens in `run_simulation`?
How big is the dictionary and could it be partitioned? If it can be partitioned, then you would be able to load the partitions in different processes locally (so you’re not GIL-bound) or even on different workers in a cluster, which would be a good option.
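For illustration, here’s a rough sketch of what partitioning could look like, assuming the dictionary splits cleanly by top-level key and each simulation only needs its own partition (the keys and the body of `run_simulation` below are just placeholders):

```python
import dask

# placeholder data: each top-level key becomes its own partition
data = {'key_1': {'data': {'blah': 'value'}}, 'key_2': {'more_data': {'blah_blah': 'values'}}}
partitions = [{key: data[key]} for key in data]

parameters = [{"a": 1}, {"a": 2}]

def run_simulation(partition, param):
    # placeholder for the real work, which would only touch its own partition
    return len(partition)

simulations = [
    dask.delayed(run_simulation)(dask.delayed(part), params)
    for part, params in zip(partitions, parameters)
]

# each process only receives its own partition, so the full dictionary
# never has to be serialized and copied into every task
results = dask.compute(*simulations, scheduler="processes")
```

The same pattern works on a distributed cluster, where each worker only needs to hold the partitions for the tasks assigned to it.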
If you’re unable to partition the data and `run_simulation` requires the GIL, then another option would be to use a hosted cloud provider with access to more memory.
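If you do go that route, the delayed graph above stays the same; a minimal sketch of pointing it at a remote cluster (the scheduler address is a placeholder) would be:

```python
from dask.distributed import Client

# placeholder address; in practice this comes from wherever the cluster is hosted
client = Client("tcp://scheduler-address:8786")

# with a Client active, dask.compute runs on the cluster by default
results = dask.compute(*simulations)
```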