Optimal way to monitor GPU memory usage during distributed training (XGBoost)

Hi @ap213, welcome to Dask Discourse forum!

First, I would like to be sure you are really launching computations on GPUs, as I don't see any hints of that in your code. Are you configuring something, somewhere, to make sure the code runs on GPUs? From the code I see, you are creating standard Dask arrays, so they would be held in the server's main memory and processed on CPUs: creating a LocalCUDACluster is not enough on its own. But maybe you just didn't include that part of the code.

To be more specific, you should use cupy directly or set it as the array backend, as in the XGBoost example:

```python
with Client(cluster) as client, dask.config.set({"array.backend": "cupy"}):
```
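
For illustration, here is a minimal sketch of that setup end to end, assuming dask-cuda and cupy are installed; the array shape and chunk size are just placeholders:

```python
import dask
import dask.array as da
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

# One worker per visible GPU
cluster = LocalCUDACluster()

with Client(cluster) as client, dask.config.set({"array.backend": "cupy"}):
    # With the cupy backend, array creation routines allocate chunks on the GPU
    x = da.random.random((100_000, 100), chunks=(10_000, 100))
    print(type(x._meta))  # should report cupy.ndarray, not numpy.ndarray
```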

Next, or in the meantime, I would also check that the GPUs are actually being used with a system tool like nvidia-smi. If you see some usage there, you should be able to get the same information from Python.
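
If you want those numbers from Python rather than from the command line, one option is to query NVML on every worker; a minimal sketch, assuming the nvidia-ml-py (pynvml) package is installed on the workers:

```python
import pynvml

def gpu_memory_used():
    # Report used/total memory (in MiB) for this worker's GPU.
    # With dask-cuda, each worker is pinned to one GPU via CUDA_VISIBLE_DEVICES,
    # so index 0 refers to that worker's own device.
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    info = pynvml.nvmlDeviceGetMemoryInfo(handle)
    return {"used_MiB": info.used // 2**20, "total_MiB": info.total // 2**20}

# Run the function on every worker and collect the results per worker address
print(client.run(gpu_memory_used))
```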

You can also use the Dask dashboard, which has GPU support if dask-cuda is installed.
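
When dask-cuda is installed, the dashboard exposes GPU memory and utilization panels; you can get its URL from the client, for example:

```python
# Open this URL in a browser; the GPU panels appear when dask-cuda is installed
print(client.dashboard_link)
```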