Hi @akhmerov, welcome!
I agree that the documentation around `distributed.Future`s is pretty confusing, and it should be improved. To me, the important differences (and I’m editorializing a bit here) are:
- `Future`s are mostly a reimplementation of the `concurrent.futures` API. It’s mostly a different API from the dask collections API (e.g., `dask.dataframe` or `dask.array`).
- `Future`s represent an unrealized result on a `distributed` cluster. They are defined in `distributed`, and almost entirely absent from the `dask/dask` codebase (I did a quick grep through, and only found one conditional import of them).
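
For reference, here is a minimal sketch of the standard-library `concurrent.futures` pattern that the first bullet refers to. It uses only the stdlib (no dask), and `square` and `run_with_executor` are made-up names for illustration. `distributed.Client.submit` and `Future.result` follow the same general shape, although the two APIs are not identical in every detail.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(x):
    """Toy task used only for illustration."""
    return x * x


def run_with_executor():
    # submit() returns a Future immediately; the work runs in the pool.
    with ThreadPoolExecutor(max_workers=2) as executor:
        future = executor.submit(square, 3)
        # result() blocks until the task has finished and returns its value.
        single = future.result()

        # Several futures can be collected as they complete, in any order,
        # so we sort the results to get a deterministic list.
        futures = [executor.submit(square, i) for i in range(4)]
        many = sorted(f.result() for f in as_completed(futures))
    return single, many
```

The key idea is that a future is a handle to a result that may not exist yet, and you only block when you ask for it.
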
I said above that it’s “mostly a different API”, but, of course, you identified some exceptions. I think one of the things that is weird about the `distributed.Client` interface is that it implements two APIs at the same time: (1) `concurrent.futures` and (2) `dask.{compute, persist}`, and there are some places where they can be mixed. I generally think that most people should stick with the standard dask collections/task-graph API until they need to do something trickier with concurrency on the client. But opinions may differ there.
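
To make the "two APIs on one object" point concrete, here is a rough sketch, assuming `dask` and `distributed` are installed. `inc` and `demo` are hypothetical names, and the in-process `Client(processes=False, ...)` setup is just my choice to keep the example small, not a recommendation.

```python
import dask
from dask.distributed import Client


def inc(x):
    """Toy task used only for illustration."""
    return x + 1


def demo():
    # Assumption: a threads-only, in-process cluster with the dashboard
    # disabled, so the sketch runs without spawning worker processes.
    with Client(processes=False, dashboard_address=None) as client:
        # (1) concurrent.futures style: submit returns a Future right away.
        future = client.submit(inc, 1)
        from_future = future.result()

        # (2) collections/task-graph style: build a lazy graph, then compute.
        lazy = dask.delayed(inc)(10)
        from_graph = lazy.compute()

        # One place where the two mix: client.compute takes a lazy
        # collection and hands back a Future instead of a concrete value.
        mixed_future = client.compute(lazy)
        from_mixed = mixed_future.result()
    return from_future, from_graph, from_mixed
```

Here `lazy.compute()` blocks and returns the value, while `client.compute(lazy)` returns a `Future` that you resolve later with `.result()`.
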
This is true if your client is asynchronous, but if you are using the default synchronous client, doesn’t it return the result of the computation directly?