API to access diagnose dashboard data

Good afternoon, friends!
We are trying to measure the maximum memory consumption on each worker over time (during the execution of a certain graph). We tried https://github.com/itamarst/dask-memusage and a homegrown solution using client.submit(), but I'm wondering whether polling at that frequency might impact performance. I see the dashboard already reports bytes stored and bytes stored per worker, so something similar must already be going on.
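
Roughly, the idea is to poll every worker for its current memory use and keep the running maximum. Here is a simplified sketch of that kind of polling (illustrative only; it uses client.run and psutil rather than our exact client.submit code, and the scheduler address is made up):

import time

import psutil
from dask.distributed import Client

def rss_bytes():
    # Resident set size of the calling worker's process, in bytes
    return psutil.Process().memory_info().rss

client = Client("tcp://scheduler:8786")  # illustrative address

peak = {}
for _ in range(60):  # sample once per second for a minute
    # client.run executes the function on every worker and
    # returns {worker_address: result}
    for addr, mem in client.run(rss_bytes).items():
        peak[addr] = max(peak.get(addr, 0), mem)
    time.sleep(1)

print(peak)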

I'm just wondering if it's possible to access the data used by the dashboard programmatically? If so, how?

Thanks in advance.
(Screenshot of the dashboard's memory plots attached)

@ubw218 Perhaps you’re looking for scheduler.memory?

from dask.distributed import LocalCluster, Client

cluster = LocalCluster()
client = Client(cluster)

# MemoryState aggregated across all workers
print(cluster.scheduler.memory)

# Or, individually

print(cluster.scheduler.memory.process)
print(cluster.scheduler.memory.managed_in_memory)
print(cluster.scheduler.memory.unmanaged_old)
print(cluster.scheduler.memory.unmanaged_recent)
print(cluster.scheduler.memory.managed_spilled)
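
And if you specifically want per-worker numbers (what the "Bytes stored per worker" plot shows), here is a rough sketch of two ways to get at them. The first touches the scheduler object directly, so it assumes a LocalCluster where the scheduler lives in the same process, and I believe WorkerState exposes the same MemoryState per worker in recent distributed versions; the second goes through the client, so it should also work against a remote scheduler:

# Per-worker MemoryState, read from the scheduler's worker state
for addr, ws in cluster.scheduler.workers.items():
    print(addr, ws.memory.process, ws.memory.managed_in_memory)

# Per-worker process memory, taken from the worker heartbeat metrics
info = client.scheduler_info()
for addr, w in info["workers"].items():
    print(addr, w["metrics"]["memory"])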

Thanks @pavithraes! I’ll give that a try
