Different scheduling for dask delayed and dask futures?

I have a question regarding the differences between delayed and futures. The basics (lazy vs immediate) are clear to me.

Let’s say we would like to read several files into an array and compute something. I structure it into several tasks and trigger the computation. I could trigger the computation using dask delayed, building a task graph.

On the other hand, I could use futures: the same tasks, but submitted as futures. Tasks that depend on results from earlier tasks take those futures as inputs, so the data is not copied back to the main process. Doing it that way, we can build something like a task graph with futures as well (at least to my understanding).
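To make the comparison concrete, here is a minimal sketch of the two variants I have in mind. The functions `load`, `process`, and `combine` are hypothetical stand-ins (no real files are read, and plain lists replace arrays), and the local in-process `Client` is only an assumption for illustration.

```python
import dask
from dask.distributed import Client


def load(i):
    # Hypothetical stand-in for reading file number i.
    return list(range(i, i + 3))


def process(chunk):
    # Hypothetical per-file computation.
    return sum(chunk)


def combine(parts):
    # Hypothetical final reduction over all per-file results.
    return sum(parts)


def run_with_delayed():
    # Lazy: calling the delayed functions only builds the graph.
    chunks = [dask.delayed(load)(i) for i in range(4)]
    parts = [dask.delayed(process)(c) for c in chunks]
    total = dask.delayed(combine)(parts)
    # Nothing has run yet; compute() hands the whole graph to a scheduler.
    return total.compute(scheduler="threads")


def run_with_futures():
    # Eager: each submit() sends one task to the scheduler right away.
    with Client(processes=False) as client:
        chunks = [client.submit(load, i) for i in range(4)]
        # Passing futures as arguments links the tasks without
        # pulling the intermediate results back to this process.
        parts = [client.submit(process, c) for c in chunks]
        total = client.submit(combine, parts)
        # Only the final result is gathered here.
        return total.result()
```

Both functions should return the same value; what differs is when the scheduler learns about each task (all at once at `compute()` versus one `submit()` at a time).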

Is there a main difference in the scheduling of the tasks if I either use delayed or futures?

Thank you for your help.

@Chris Welcome to Discourse and thanks for this question!

> Doing it that way, we can build something like a task graph with futures as well (at least to my understanding).

You’re right about this.

I think of Delayed and Futures as distinct APIs. Futures is bundled with the Distributed scheduler and gives finer control over some aspects of distributed computations. Delayed builds a lazy graph that can run on any Dask scheduler, and I find it best suited for single-machine threaded computations.

I also find this answer by Ian Rose quite nice, and you may find it helpful too: "Documentation on the interplay between graphs and futures".
