Hello everybody again,
In my code I generate a huge task graph to parallelize a data stream. Unfortunately, the code has a lot of conditionals, which makes it hard to tell which part of the code produced which part of the generated task graph. So my question is really simple: is it possible to attach a label to a specific part of the task graph?

    import dask.array

    def my_mean(block):
        return block.mean()

    if __name__ == '__main__':
        my_cond = True

        dask_array = dask.array.random.random(100000)
        mean = my_mean(dask_array)
        if my_cond:
            mean = my_mean(mean)

        mean.compute()

Notice that the task graph depends on the my_cond variable, not on the values of dask_array itself.
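To make that dependence concrete, here is a minimal sketch that builds the graph both ways and compares the number of layers, without computing anything. The helper name build_mean and the chunks=10000 setting are only assumptions for illustration; they are not part of my real code.

```python
import dask.array as da


def build_mean(my_cond):
    """Build (but do not compute) the mean, optionally applying it twice."""
    # chunks=10000 is an arbitrary choice for this illustration
    dask_array = da.random.random(100000, chunks=10000)
    mean = dask_array.mean()
    if my_cond:
        # The second mean() adds extra layers to the graph
        mean = mean.mean()
    return mean


# Count the high-level graph layers for each branch of the conditional
layers_without = len(build_mean(False).dask.layers)
layers_with = len(build_mean(True).dask.layers)

print("layers with my_cond=False:", layers_without)
print("layers with my_cond=True: ", layers_with)
```

The two counts differ, but the layer names are auto-generated, so nothing in the graph tells me which branch of my code created which layer.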
What I would like to do (or something similar) is:

    import dask.array

    def my_mean(block, label):
        # DaskTaskLabel is hypothetical: it is the kind of API I am asking for
        with DaskTaskLabel(label):
            return block.mean()

    if __name__ == '__main__':
        my_cond = True

        dask_array = dask.array.random.random(100000)
        mean = my_mean(dask_array, "mean1")
        if my_cond:
            mean = my_mean(mean, "mean2")

        mean.compute()

That way I could check exactly when each mean() call was made and who made it, and I could even filter by specific labels. This would be very useful for huge task streams like mine.