Distributed performance report of native code

vpadulan · July 31, 2024, 3:28pm

Hi,

I have recently been very happy using the memray plugin for Dask distributed (see the API page of the Dask.distributed 2024.7.1 documentation). The most important feature of this plugin, or rather of the underlying tool memray, is that it tracks memory allocated by native code. My application spends the vast majority of its runtime in C and C++ code, so any metrics gathered need to cover the libraries called from the thin Python layer.

In this regard, I am now looking for an equivalent solution for performance profiling. I am aware of performance_report, but it is severely lacking in this respect because it cannot track native code run by the Dask worker (or at least I couldn’t find an option to enable this behaviour).

So my question is: is there a way to get a report of the runtime spent by a Dask worker running a mostly C and C++ application, similar to the memory usage report given by the memray plugin?

Thanks!
Vincenzo

guillaumeeb · July 31, 2024, 3:48pm

Hi @vpadulan,

I’m not yet familiar with the Dask memray plugin, so I’m not sure I understand your question: why is this plugin not enough for what you want to do?

vpadulan · August 1, 2024, 6:39am

Dear @guillaumeeb,

Thanks for your prompt reply!

The Dask memray plugin uses memray (GitHub - bloomberg/memray), a memory profiler for Python, to gather a memory profile of the application from all the workers. What I am looking for instead is runtime profiling, i.e. what the Dask performance_report function provides. But that only covers Python code, whereas I am looking for something akin to the output of perf + flamegraph, which is also what Python's support for the Linux perf profiler (Python 3.12.4 documentation) makes possible. Clearly, though, I would need Dask support in the form of a plugin that automatically runs the profiling on the worker process and creates a flamegraph from it.

Cheers,
Vincenzo

guillaumeeb · August 1, 2024, 2:25pm

OK, I’m not an expert here. I don’t think that exists, but you can look at the memray plugin implementation or at dask-pyspy (GitHub - gjoseph92/dask-pyspy: Profile the dask distributed scheduler with py-spy and viztracer) to see how it has been done, and maybe propose a tool?
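
To make the idea of "a plugin that runs perf on the worker process" more concrete, here is a minimal sketch of what such a tool might look like. It is not an existing Dask feature: `PerfWorkerPlugin`, `perf_record_command`, the `perf-profiles` directory and the 99 Hz default are all hypothetical names and values chosen for illustration. It assumes `perf` is installed on every worker host and that the worker has permission to attach to its own process. The class is written as a plain class so the sketch runs without Dask installed; to register it on a cluster it would have to inherit from `distributed.diagnostics.plugin.WorkerPlugin`.

```python
import os
import signal
import subprocess


def perf_record_command(pid, output, frequency=99):
    """Build a `perf record` command that samples an already-running process."""
    return [
        "perf", "record",
        "-F", str(frequency),  # sampling frequency in Hz
        "-g",                  # record call graphs, needed for flamegraphs
        "-p", str(pid),        # attach to an existing process
        "-o", output,          # path of the perf data file to write
    ]


class PerfWorkerPlugin:
    """Hypothetical plugin: keep `perf record` attached to a worker process.

    Assumption: in a real deployment this would subclass
    distributed.diagnostics.plugin.WorkerPlugin, whose setup() and
    teardown() hooks run inside the worker process.
    """

    def __init__(self, directory="perf-profiles", frequency=99):
        self.directory = directory
        self.frequency = frequency
        self._proc = None

    def setup(self, worker):
        os.makedirs(self.directory, exist_ok=True)
        pid = os.getpid()  # the worker's own PID, since setup() runs in the worker
        output = os.path.join(self.directory, f"perf-{pid}.data")
        self._proc = subprocess.Popen(
            perf_record_command(pid, output, self.frequency)
        )

    def teardown(self, worker):
        if self._proc is not None:
            # perf writes out its data file when interrupted with SIGINT
            self._proc.send_signal(signal.SIGINT)
            self._proc.wait()
            self._proc = None
```

Once the class inherits from `WorkerPlugin`, registering it would presumably be a matter of `client.register_plugin(PerfWorkerPlugin())`. The resulting `perf-<pid>.data` files would still need to be collected from the workers and turned into flamegraphs separately (for example with `perf script` and the FlameGraph scripts), and Python frames would only show up if the workers run with the perf support described in the Python 3.12 documentation enabled.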
