Performance of Client.submit appears to depend on the size of traceback

See here: distributed/client.py at b1e06a001c6b0b104a4567fc15dd07edd51fb7aa · dask/distributed · GitHub

The traceback appears to be transmitted to the scheduler. I'm seeing that when the caller of submit is a function with fewer lines of code, my Dask application runs faster. Is this a known issue? Is the traceback required, or is it for debugging only? If it's the latter, it might make sense to optionally disable transmission of the traceback.

Hi @elibol. Yes, the calling code is transmitted to the scheduler when client.[compute, persist, submit] is called. This is primarily used for diagnostic/reporting purposes (associating code snippets with actual tasks that get run).

In principle, if the code snippet is very large, this will increase the overhead of transmitting data to the scheduler. But I'd only expect that to happen if you are submitting many small individual tasks with large code snippets (best practice would suggest that you submit larger task graphs all at once rather than many small ones). And even then, I'd be surprised if it made a big difference in the actual computation time, rather than in the client-scheduler communication overhead.
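To illustrate the contrast, here is a minimal sketch on an in-process cluster (the `inc` function is a hypothetical example, not from the thread): many individual `submit` calls each incur a client-scheduler message, while building one larger graph with `dask.delayed` and handing it to `client.compute` sends the whole graph at once.

```python
import dask
from dask.distributed import Client

def inc(x):
    return x + 1

client = Client(processes=False)  # in-process cluster, for illustration only

# Many small submissions: one client->scheduler message per task,
# each carrying metadata such as the calling code snippet.
futures = [client.submit(inc, i) for i in range(100)]
results_small = client.gather(futures)

# One larger graph submitted at once: the tasks travel together
# in a single submission.
lazy = [dask.delayed(inc)(i) for i in range(100)]
futures_batched = client.compute(lazy)
results_batched = client.gather(futures_batched)

client.close()
```

Both variants compute the same results; the difference is in how many round trips the client makes to the scheduler.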

Are you able to provide a minimal reproducible example demonstrating your slowdown? In particular, I’m curious to see two things:

  • Whether we can reproduce the issue with many small submits, and whether there is a workaround by submitting larger graphs, and
  • Whether we can determine if the computations themselves slow down, instead of the client-scheduler communication overhead.

I do think it could be reasonable to make the transmission of this data optional; I'm just trying to understand how big a problem it is in practice.

Hi @ian,

Is it possible to submit multiple tasks via Client.submit?

Right now, we’re using Dask as a backend for project NumS (see here for implementation). When we run an ablation comparing various algorithms and schedulers, we’re seeing that Client.submit dominates the execution time within the driver process, and the total execution time of certain benchmarks is determined by how quickly Client.submit executes. For instance, as we increase the number of blocks per array for some micro benchmarks, the impact on performance is amplified. I believe this is a critical overhead to reduce in order to improve Dask’s performance on short-lived tasks. We can look into batching if it’s possible via Client.submit.

Can you use Client.map instead? That will be much faster than many individual submissions.
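As a sketch of that suggestion (the `square` function here is a hypothetical example), a single `Client.map` call replaces a loop of individual `submit` calls and creates the whole batch of tasks at once:

```python
from dask.distributed import Client

def square(x):
    return x ** 2

client = Client(processes=False)  # in-process cluster, for illustration only

# Instead of:  futures = [client.submit(square, x) for x in range(8)]
# one map call batches task creation for the whole input sequence:
futures = client.map(square, range(8))
results = client.gather(futures)

client.close()
```

`client.map` returns a list of futures, one per input element, which can then be gathered like any futures produced by `submit`.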
