Is there a way to specify the priority of a task with dask.delayed? I have a small number of independent tasks that I know will take much longer than the rest of my tasks, so I want to start them first, before the faster tasks.
Hi @bcaddy, welcome to the Dask community!
Did you try the mechanism explained in the docs?
https://distributed.dask.org/en/stable/priority.html
How exactly? That page says nothing about dask.delayed, and the API pages for dask.delayed and dask.compute don’t mention the priority kwarg.
A user defined priority is provided by the priority= keyword argument to functions like compute(), persist(), submit(), or map()
So you can pass the priority kwarg to dask.compute, which accepts extra kwargs. The Client.compute() API documentation mentions it.
You can also use Dask annotations for any collection, as mentioned in the documentation page.
Does priority take a list in compute? I have several thousand tasks that I call compute on but I only want to prioritize about 7.
With the annotations, do I put the ‘with’ block around the compute call or the dask.delayed call?
Well, I guess you need to be able to make separate calls to compute depending on the priority of your tasks.
Do you have some reproducer?
Won’t it wait on the first call to compute to finish before moving on to the next? I’ve only got 7 long running tasks but 256 processes.
Do you use a Distributed cluster? If so, you can use Client.compute, which returns futures right away instead of blocking until the results are ready.
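Here is a minimal sketch of what that could look like, assuming a distributed Client is available. The names work and run_with_priorities are hypothetical stand-ins, and the in-process cluster is only there so the snippet runs on its own; in practice you would connect to your existing scheduler. The slow tasks are submitted with a higher priority value, since tasks with higher priorities run earlier, and neither call to Client.compute waits for the other.

```python
import dask
from dask.distributed import Client


@dask.delayed
def work(x):
    # Hypothetical stand-in for a real task.
    return x * 2


def run_with_priorities(slow_inputs, fast_inputs):
    # In-process cluster for illustration only (assumption); replace with
    # Client("<your scheduler address>") on a real deployment.
    with Client(processes=False, dashboard_address=None) as client:
        slow = [work(x) for x in slow_inputs]
        fast = [work(x) for x in fast_inputs]

        # Client.compute returns futures immediately, so the second call
        # is issued while the first batch is still running.
        slow_futures = client.compute(slow, priority=10)
        fast_futures = client.compute(fast)  # default priority is 0

        # Block only once, when collecting the results.
        return client.gather(slow_futures), client.gather(fast_futures)


if __name__ == "__main__":
    slow_results, fast_results = run_with_priorities([1, 2], [3, 4, 5])
    print(slow_results, fast_results)
```

The priority only influences scheduling order when workers are contended; with more workers than tasks, everything may start at once regardless.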
I do and that should do it. Thank you
It would be really nice if dask.delayed supported setting the priority directly; the ergonomics would be much better.