This is a hard question to answer without knowing more about how you’d like to use Dask. Mind sharing a bit more about what you’d like to do and how you’re deploying Dask (e.g. locally, HPC cluster, in the cloud)?
In addition to the Dask documentation on the Delayed and Futures APIs, this explanation from Ian Rose might help:
I will deploy Dask on Kubernetes, and I’m using both delayed and futures objects at the moment.
This is an example of how I am trying to parallelize:
missing_frequencies = []
for col in columns:
    missing_frequencies.append(dask_df.map_partitions(get_frequencies, col))
# get_frequencies is a delayed method
missing_frequencies_futures = client.compute(missing_frequencies)
missing_frequencies_result = client.gather(missing_frequencies_futures)
I am not sure whether that is good practice; I would greatly appreciate any help here. Thank you!
My motto here is: “Use Delayed when you can, fall back to Futures when you need to”.
I find Delayed much more elegant and simple for non-real-time work: graphs or workflows you can define from start to end without needing to inspect part of the result at some point.
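For instance, here is a minimal sketch of the Delayed style (the `load` and `combine` functions are made up for illustration): the whole graph is built lazily up front, and nothing runs until you call `.compute()`.

```python
import dask


@dask.delayed
def load(x):
    # Stand-in for some expensive loading step
    return x * 2


@dask.delayed
def combine(parts):
    # Stand-in for an aggregation over all loaded pieces
    return sum(parts)


parts = [load(i) for i in range(4)]  # builds the task graph, runs nothing yet
total = combine(parts)               # Delayed objects compose into one graph
result = total.compute()             # only now does the whole graph execute
```

The point is that no intermediate result is ever materialized on the client side; the scheduler sees the entire graph at once and can optimize it.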
But sometimes you just need the Futures API: Delayed simply feels wrong to me for some workflows, e.g. when the next task to submit depends on a result you have to look at first.
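As a sketch of that situation (again with made-up functions, using an in-process `Client` just for illustration): with Futures you can inspect an intermediate result and decide what to submit next based on it, which a statically defined Delayed graph can’t express.

```python
from dask.distributed import Client


def square(x):
    return x ** 2


# In-process client for demonstration; in practice you'd connect
# to your Kubernetes-deployed scheduler instead.
client = Client(processes=False)

fut = client.submit(square, 4)
if fut.result() > 10:              # inspect a result mid-workflow...
    fut = client.submit(square, fut)  # ...then submit more work based on it
result = fut.result()

client.close()
```

That inspect-then-submit loop is exactly the “when you need to” part of the motto.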