Hi,
I have recently been very happily using the memray plugin for dask distributed (see the API page of the Dask.distributed 2024.7.1 documentation). The crucial feature of this plugin, or rather of the underlying tool memray, is that it tracks memory allocated by native code. My application spends the vast majority of its runtime in C and C++ code, so any metrics I gather need to cover the libraries called from the thin Python layer.
In this regard, I am now looking for an equivalent solution for performance profiling. I am aware of performance_report, but it is severely lacking in this respect because it cannot keep track of native code run by the Dask worker (or at least I couldn’t find any option to enable this behaviour).
So my question is: is there a way to get a nice report of the runtime spent by a Dask worker running a mostly C and C++ application, similar to the memory usage report given by the memray plugin?
Thanks!
Vincenzo
Hi @vpadulan,
I’m not yet familiar with the dask memray plugin, so I’m not sure I understand your question: why is this plugin not enough for what you want to do?
Dear @guillaumeeb ,
Thanks for your prompt reply!
The Dask memray plugin uses memray (GitHub: bloomberg/memray), a memory profiler for Python, to gather a memory profile of the application from all the workers. What I am looking for instead is runtime profiling, i.e. what the dask performance_report function provides. But that only covers Python code, whereas I’m looking for something akin to the output of perf + flamegraph, i.e. what can also be achieved via the Python support for the Linux perf profiler (see the Python 3.12.4 documentation). But clearly I would need Dask support in the form of a plugin that automatically runs the profiling on the worker process and creates a flamegraph out of it.
Cheers,
Vincenzo
OK, not an expert here. I don’t think that exists, but you could look at the memray plugin implementation or at dask-pyspy (GitHub: gjoseph92/dask-pyspy, which profiles the Dask distributed scheduler with py-spy and viztracer) to see how it has been done, and maybe propose a tool?
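
For anyone who wants to try the "propose a tool" route, here is a rough sketch of what such a worker plugin could look like. It is not an existing Dask feature: the class name `PySpyNativePlugin` and the helper `pyspy_command` are made up for illustration. It assumes py-spy is installed on every worker, that the platform supports py-spy's `--native` mode, and that the worker is allowed to attach to its own process with ptrace (on many Linux setups this needs extra permissions).

```python
import os
import signal
import subprocess

try:
    from distributed.diagnostics.plugin import WorkerPlugin
except ImportError:
    # Fallback so the command-building helper can be used without distributed.
    WorkerPlugin = object


def pyspy_command(pid, output, native=True, rate=100):
    """Build a `py-spy record` command line that attaches to process `pid`."""
    cmd = [
        "py-spy", "record",
        "--pid", str(pid),
        "--output", output,
        "--format", "flamegraph",
        "--rate", str(rate),
    ]
    if native:
        # Ask py-spy to also sample native (C/C++/Cython) stack frames.
        cmd.append("--native")
    return cmd


class PySpyNativePlugin(WorkerPlugin):
    """Hypothetical plugin: starts one py-spy sampler per worker process."""

    def __init__(self, output_dir="."):
        self.output_dir = output_dir
        self._proc = None

    def setup(self, worker):
        # Runs inside the worker process, so os.getpid() is the worker's PID.
        output = os.path.join(self.output_dir, f"pyspy-{os.getpid()}.svg")
        self._proc = subprocess.Popen(pyspy_command(os.getpid(), output))

    def teardown(self, worker):
        # py-spy writes its output file when interrupted.
        if self._proc is not None and self._proc.poll() is None:
            self._proc.send_signal(signal.SIGINT)
            self._proc.wait(timeout=30)
```

With a recent distributed this would be registered with something like `client.register_plugin(PySpyNativePlugin("/tmp/profiles"))`, and each worker would leave one SVG flamegraph in that directory when it shuts down cleanly. Collecting the files back to the client, as the memray plugin does, is left out of this sketch.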