Extracting log data from distributed performance report

With the Dask Distributed performance report, we can conveniently log/stream performance metrics to an HTML file:

import dask
from dask.distributed import performance_report

with performance_report(filename="dask-report.html"):
    # some Dask computation; `tasks` is a collection of Dask objects defined elsewhere
    result = dask.compute(*tasks)

The HTML file contains Bokeh visualizations of the different metrics; for example:

However, now I need to combine these metrics with data from another experiment, so I need to access the underlying data that has been used to make these plots. How can we access the logged performance metrics? Or how can we log the streamed performance metrics?

Hi,

I looked into this a bit, but did not find anything that was available out of the box…

Maybe it’s possible using Bokeh to save the data as JSON (it is stored as JSON in the HTML file), but I’m not sure how.
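A rough sketch of the JSON idea, using only the standard library and not verified against a real report: it assumes the report embeds its Bokeh document in `<script type="application/json">` tags, that the content may be HTML-escaped, and that the data sources appear as `ColumnDataSource` objects somewhere in that JSON. The helper names `extract_json_blocks` and `find_column_data` are made up for this example, and the exact JSON layout differs between Bokeh versions, so the walker just searches recursively instead of relying on a fixed path.

```python
import html
import json
from html.parser import HTMLParser


class _JsonScriptParser(HTMLParser):
    """Collect the raw text of every <script type="application/json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_script = False

    def handle_data(self, data):
        # Script content can arrive in several chunks, so concatenate them.
        if self._in_json_script:
            self.blocks[-1] += data


def extract_json_blocks(html_text):
    """Return the parsed JSON payload of each embedded JSON script tag."""
    parser = _JsonScriptParser()
    parser.feed(html_text)
    parser.close()
    # Assumption: the embedded JSON may be HTML-escaped, so unescape first.
    return [json.loads(html.unescape(block)) for block in parser.blocks if block.strip()]


def find_column_data(node, found=None):
    """Recursively collect the `data` attribute of anything that looks like
    a ColumnDataSource (older layout: "type"; newer layout: "name")."""
    if found is None:
        found = []
    if isinstance(node, dict):
        kind = node.get("name") if node.get("type") == "object" else node.get("type")
        if kind == "ColumnDataSource":
            found.append(node.get("attributes", {}).get("data"))
        for value in node.values():
            find_column_data(value, found)
    elif isinstance(node, list):
        for item in node:
            find_column_data(item, found)
    return found


# Usage (assuming the report from the question exists on disk):
# with open("dask-report.html", encoding="utf-8") as f:
#     sources = find_column_data(extract_json_blocks(f.read()))
```

The returned entries are left as they appear in the file. Depending on the Bokeh version, columns may be plain lists, a map-style wrapper, or encoded arrays, so some decoding is probably still needed before combining them with other data.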

Another possibility would be to use Prometheus metrics, but again, you’ll have to set something up…
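For the Prometheus route, a minimal sketch of what "setting something up" could look like without running a Prometheus server: poll the scheduler's metrics endpoint yourself and parse the text format. This assumes `prometheus_client` is installed so that the scheduler serves `/metrics` on the dashboard port, and that the dashboard runs at `localhost:8787`; adjust the URL for your cluster. The function names are made up for this example, and the parser is deliberately simplified (it assumes lines carry no trailing timestamp).

```python
from urllib.request import urlopen


def parse_prometheus_text(text):
    """Parse Prometheus text-format lines into {series_name_with_labels: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Simplification: treat the last whitespace-separated token as the value.
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # ignore lines that do not fit the simplified format
    return samples


def scrape(url="http://localhost:8787/metrics"):
    """Fetch and parse one snapshot of the metrics (the URL is an assumption)."""
    with urlopen(url, timeout=5) as response:
        return parse_prometheus_text(response.read().decode("utf-8"))
```

Calling `scrape()` periodically while the computation runs and storing each snapshot with a timestamp would give a time series that can be merged with other experiment data. Note that these metrics are not the same set as the plots in the performance report.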