How to use client.compute() with sync=True

I’m doing local asynchronous testing before setting up the code to be run on cluster. With client = Client(..., asynchronous=True) computing tasks need to be done with client.compute() which has a sync parameter. However, I don’t understand the code’s behavior when this parameter is set to True, as the following example shows:

async def main():
    from dask.distributed import Client
    import dask

    is_async = False
    client = Client(threads_per_worker=12, n_workers=1, asynchronous=is_async)

    def compute_partial(i) -> set[int]:
        return {i}

    tasks = [dask.delayed(compute_partial)(i) for i in range(2)]
    task = tasks[0]
    if is_async:
        print(await client.compute(tasks, sync=True))
        print(await client.compute(task, sync=True))
    else:
        print(client.compute(tasks, sync=True))
        print(client.compute(task, sync=True))


if __name__ == '__main__':
    import asyncio
    asyncio.run(main())

I expect the code to run fine regardless of the value of is_async. When is_async=False it does run without issues, but when is_async=True the second print statement crashes the code with 'coroutine' object is not iterable.

Anybody can help explain why this is the case? Thanks.

1 Like

Hi @Tianming_Han, welcome to Dask community!

Not an expert in async Python programming here. But a few hints.

As said in the documentation:

Dask’s normal .compute() methods are synchronous, meaning that they block the interpreter until they complete.

I think the correct way of doing things asynchronously is described here, or here.

1 Like