Hello folks,
I’m trying to run a computation on a 30 GB dataset across 4 clustered GPUs.
Even when I split the data into small chunks of 100 MB, the memory usage grows so much that I get allocation errors.
The point is: how can I efficiently profile the GPU memory usage of my process? For context, I’m using CuPy and Dask Arrays.
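Roughly, the computation looks like the sketch below (the array shape, chunk size, and reduction are only illustrative, not my real workload):

```python
import cupy as cp
import dask.array as da

# Illustrative only: ~30 GB of float64 split into ~100 MB chunks,
# with each chunk converted to a CuPy array so it lives in GPU memory.
x = da.ones((300, 12_500_000), chunks=(1, 12_500_000))  # 300 chunks of ~100 MB each
gx = x.map_blocks(cp.asarray)

# Placeholder reduction; this is where the allocation errors show up.
result = (gx ** 2).sum().compute()
```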
If I were using only the CPU and host memory, I could easily use the dask-memusage plugin, but unfortunately it does not work with GPUs.
I’m not using the dashboard because I’m running on a cluster that does not let me open ports externally.
Any thoughts and suggestions are welcome.
Hi @jcfaracco,
Did you go through these two pages?
https://distributed.dask.org/en/stable/diagnosing-performance.html
https://docs.dask.org/en/stable/diagnostics-distributed.html
There might be some useful tools there, such as performance_report or MemorySampler, for example. I’m not sure how they behave with GPUs, though.
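For instance, something roughly like this (the scheduler address and the computation are placeholders; note that MemorySampler records the workers’ process memory, so I’m not sure it captures GPU allocations):

```python
from dask.distributed import Client, performance_report
from distributed.diagnostics import MemorySampler

client = Client("tcp://scheduler-address:8786")  # placeholder address

ms = MemorySampler()
with performance_report(filename="dask-report.html"):  # standalone HTML report
    with ms.sample("my-computation"):
        result = my_dask_array.sum().compute()  # placeholder computation

df = ms.to_pandas()  # one column per labelled sample, indexed by time
print(df.describe())
```

One nice thing is that performance_report writes a standalone HTML file, so you can copy it off the cluster and open it locally instead of opening dashboard ports.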
The only other solution I see is using external tooling like nvidia-smi
(there might be some packages that are able to record the output of this command).
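For example, a small polling script along these lines could record memory.used for each GPU while your job runs (assuming nvidia-smi is on the PATH of every node):

```python
import subprocess
import time


def sample_gpu_memory(interval_s=1.0, duration_s=60.0):
    """Yield (timestamp, [used MiB per GPU]) by polling nvidia-smi."""
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        used = [int(line) for line in out.splitlines() if line.strip()]
        yield time.time(), used
        time.sleep(interval_s)


# Example: log every 2 seconds for 10 seconds.
for ts, used_mib in sample_gpu_memory(interval_s=2.0, duration_s=10.0):
    print(ts, used_mib)
```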
You could also try without GPUs and see how it goes.
Also, did you try using SSH port forwarding?
I wrote a plugin similar to dask-memusage. If anyone is interested: GitHub - discovery-unicamp/dask-memusage-gpus: A thread-based and low-impact GPU memory profiler for Dask.
It is missing documentation, but I will add that in the coming weeks.
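Until then, the general idea (heavily simplified here, and not the plugin’s actual API) is a Dask worker plugin whose background thread samples GPU memory; this sketch uses pynvml purely as an illustration:

```python
import threading
import time

import pynvml
from dask.distributed import Client
from distributed.diagnostics.plugin import WorkerPlugin


class GPUMemorySampler(WorkerPlugin):
    """Illustrative sketch: a background thread polls NVML on each worker."""

    def __init__(self, interval_s=1.0):
        self.interval_s = interval_s
        self.samples = []  # (timestamp, bytes used) tuples
        self._stop = threading.Event()

    def setup(self, worker):
        pynvml.nvmlInit()
        # Assumes one GPU per worker, as in a typical dask-cuda setup.
        self._handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        self._thread = threading.Thread(target=self._poll, daemon=True)
        self._thread.start()

    def teardown(self, worker):
        self._stop.set()
        self._thread.join()
        pynvml.nvmlShutdown()

    def _poll(self):
        while not self._stop.is_set():
            info = pynvml.nvmlDeviceGetMemoryInfo(self._handle)
            self.samples.append((time.time(), info.used))
            time.sleep(self.interval_s)


client = Client("tcp://scheduler-address:8786")     # placeholder address
client.register_worker_plugin(GPUMemorySampler())   # install on every worker
```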