I’m looking for a way to get the benefits of the Futures API (namely, a dynamic DAG) combined with a custom optimizer that can skip tasks that aren’t needed. For the other APIs, it seems you can do this by setting the X_optimize config option as described here. However, since I’m using futures that I create with client.submit(), I’m not sure if I can apply an optimizer to this, or if it even makes sense to have an optimizer (since it’s a dynamic DAG). If there is no optimizer hook for this case, is there some other hook I can leverage to optimize each Future before it runs? Or will I need to do it using a decorator on each function I submit?
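For reference, the decorator fallback I have in mind would look roughly like the sketch below. It is only an illustration: skippable, is_needed and expensive_square are hypothetical names of mine, not part of Dask, and the predicate stands in for whatever logic decides that a task is unnecessary.

```python
import functools


def skippable(is_needed):
    """Wrap a function so its body only runs when is_needed(*args, **kwargs) is true.

    `is_needed` is a hypothetical user-supplied predicate, not a Dask hook.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not is_needed(*args, **kwargs):
                # Skip the real work and return a placeholder result.
                return None
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical rule: only even inputs are worth computing.
@skippable(lambda x: x % 2 == 0)
def expensive_square(x):
    return x * x
```

The wrapped function can be passed to client.submit() like any other callable. Note that the task is still scheduled and sent to a worker; only the function body is skipped, so this is weaker than a real graph optimizer.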
@multimeric Good question! This functionality doesn’t currently exist for the Futures API. Futures is a pretty low-level API, and optimization is a comparatively higher-level operation. We can also think of Futures as a manual way of setting up parallel computations, in which case we usually use cancel to manually cancel any Futures that are no longer needed.
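As a rough sketch of that manual approach, assuming you can decide after submission which Futures are unnecessary: cancel_unneeded and is_needed below are hypothetical helpers, while Client.cancel and Future.key come from dask.distributed.

```python
from dask.distributed import Client


def cancel_unneeded(client, futures, is_needed):
    """Cancel every future for which the hypothetical is_needed(future) is false.

    Returns the list of futures that were cancelled.
    """
    unneeded = [f for f in futures if not is_needed(f)]
    if unneeded:
        # Client.cancel accepts a list of futures and stops them from
        # being scheduled if they have not run yet.
        client.cancel(unneeded)
    return unneeded
```

For example, with futures = [client.submit(func, i) for i in range(4)] and keep = {f.key for f in futures[:2]}, calling cancel_unneeded(client, futures, lambda f: f.key in keep) would cancel the last two futures.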
However, I think this feature might make sense (even though I’m not sure what the API would look like), so please feel free to open a feature request!
Thanks for the answer. I didn’t realise that Futures were considered lower level, actually. This part of the docs made me think that they were higher level:
This interface is good for arbitrary task scheduling like dask.delayed, but is immediate rather than lazy, which provides some more flexibility in situations where the computations may evolve over time.
GitHub issue is here: Optimizer hook for Futures API · Issue #9228 · dask/dask.