In my workflow I’m making heavy use of DataFrame.apply in multiple different contexts. The issue is that when I look at the dashboard, it’s not clear which of these is actually running, because the tasks are all called “apply”.
e.g. on the Workers page:
Is there any way to tell Dask that I want each apply call to have a custom name, for identifiability?
I was looking for something similar using the futures API. In client.submit it is possible to specify a key for each future, but I have not seen it used in the dashboard (at least not in the nice task charts :-))
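For reference, here is a minimal sketch of what passing a key to client.submit looks like. The function name, the key string, and the in-process cluster settings are made up for illustration, and it only shows that the future carries the key; it does not show how the dashboard displays it.

```python
from dask.distributed import Client


def double(x):
    # Hypothetical stand-in for the real work.
    return 2 * x


# Start a small in-process cluster (assumption: no existing cluster to connect to).
with Client(processes=False, n_workers=1, threads_per_worker=1) as client:
    # key= sets the task key explicitly instead of letting Dask generate one.
    future = client.submit(double, 21, key="double-custom-0")
    submitted_key = future.key
    value = future.result()

print(submitted_key, value)
```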
The trouble is I’m not sure how to manipulate the futures when using the DataFrame API because it kind of abstracts all that away.
I am not even sure that it is possible to manually set a key for an operation on a DataFrame (I couldn’t find a way either), but I’m sort of piggy-backing on the thread.
@multimeric I don’t think you can customize this in the high-level collections like DataFrame and Array. (Only the lower-level Delayed and Futures APIs allow this, as @tomercagan suggested.)
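To illustrate the Delayed route, here is a rough sketch that names a task through the dask_key_name keyword accepted when calling a delayed function. The function add_total, the sample data, and the key string are hypothetical, and this wraps a plain pandas operation rather than DataFrame.apply on a Dask collection, so treat it as a workaround sketch rather than a fix for the original question.

```python
import dask
import pandas as pd


def add_total(df):
    # Hypothetical per-partition work; stands in for whatever apply would do.
    return df.assign(total=df["a"] + df["b"])


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# dask_key_name sets the key of this one task explicitly.
task = dask.delayed(add_total)(df, dask_key_name="add-total-0")

# Run on the single-threaded scheduler so the sketch needs no cluster.
result = task.compute(scheduler="synchronous")
print(task.key)
print(result)
```

With a distinct key per call, each wrapped step can at least be told apart by its key, though you give up the convenience of the DataFrame API for those steps.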
Okay, I might submit this as a feature request on GitHub in that case.
@multimeric Thanks for opening the issue!