@jcfaracco Good question!
Dask’s low-level collections, Delayed and Futures, allow you to specify labels:
- Delayed: using the `dask_key_name` keyword argument, docs: API — Dask documentation
- Futures: using the `key` parameter in `client.submit` and `client.map`
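For the Futures case, here is a minimal sketch of what that could look like. The `square` function, the key names, and the in-process `Client(processes=False)` setup are just assumptions to keep the example self-contained; with your own cluster you would connect the client as usual.

```python
from dask.distributed import Client


def square(x):
    # Hypothetical toy function, only here to have something to submit.
    return x ** 2


# Assumption: an in-process, threads-only local cluster with the dashboard
# disabled, so the sketch runs without any external setup.
client = Client(processes=False, dashboard_address=None)

# One task with an explicit key.
single = client.submit(square, 3, key="square-single")

# Several tasks: a list gives one explicit key per input
# (a plain string would be used as a prefix instead).
many = client.map(square, [1, 2, 3], key=["square-a", "square-b", "square-c"])

single_key = single.key
many_keys = [future.key for future in many]

single_result = single.result()
results = client.gather(many)

client.close()
```

The keys you pass are the ones that show up on the futures (and in the dashboard), so `single_key` and `many_keys` should match the names given above.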
This isn’t implemented yet for high-level collections (like Dask Array in your example), though. Here’s the open issue: Mechanism for naming tasks generated by high level collections (dask/dask issue #9047).
A workaround could be to rewrite your code using low-level collections:
```python
import numpy as np
from dask import delayed


@delayed
def my_mean(block):
    return block.mean()


my_cond = True
arr = np.random.random(100000)

mean = my_mean(arr, dask_key_name="mean1")
if my_cond:
    mean = my_mean(arr, dask_key_name="mean2")

mean  # Delayed('mean2')
```
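If you want to double-check that the label was picked up, you can look at the `key` attribute of the Delayed object before computing it. This is just a small sketch with a tiny made-up input array, not a required step:

```python
import numpy as np
from dask import delayed


@delayed
def my_mean(block):
    return block.mean()


# Small deterministic input (an assumption for illustration only).
arr = np.arange(10, dtype=float)

mean = my_mean(arr, dask_key_name="mean1")

# The label passed via dask_key_name becomes the task key.
task_key = mean.key

# Computing works as usual; the key only affects how the task is named.
value = mean.compute()
```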
In this case, please be careful not to mix collections!