Hello,
I’m wondering why, in my example, using the GPU (5.7 s) is ~4 times slower than using the CPU (1.5 s):
import dask.array as da
import cupy as cp
from dask_cuda import LocalCUDACluster
from dask.distributed import Client, LocalCluster
from datetime import datetime as dtt
N = 10
chunk_unit = 20
if __name__ == "__main__":
    # generate random numbers on cpu
    m = da.random.normal(size=(N*chunk_unit, 1024, 1204), chunks=(chunk_unit, 1024, 1024))
    # with cpu
    with LocalCluster() as cluster, Client(cluster) as client:
        # take average
        st = dtt.now()
        c = m.mean(axis=0).compute()
        print("finish CPU", dtt.now() - st, c[0,0], sep='\n')
    # with gpu
    with LocalCUDACluster() as cluster, Client(cluster) as client:
        # move to gpu useful ?
        n = da.map_blocks(lambda a: cp.asarray(a),
                          m,
                          dtype=float,
                          meta=cp.array([]),
                          )
        # take average
        st = dtt.now()
        c = n.mean(axis=0).compute()
        print("finish GPU", dtt.now() - st, c[0,0], sep='\n')
This returns:
finish CPU
0:00:01.478294
0.0368461756129715
and
/opt/conda/lib/python3.12/site-packages/dask_cuda/utils.py:171: UserWarning: Cannot get CPU affinity for device with index 0, setting default affinity
warnings.warn(
finish GPU
0:00:05.684561
0.0368461756129715
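For scale, here is a rough calculation of how much data the map_blocks step above has to copy from host memory to the GPU. It is only a sketch: it assumes float64 elements (which I believe is the default dtype for da.random.normal), and the names total_bytes, n_chunks and largest_chunk_bytes are just illustrative.

```python
import math

# Shapes copied from the example above.
N = 10
chunk_unit = 20
shape = (N * chunk_unit, 1024, 1204)
chunks = (chunk_unit, 1024, 1024)

# Assumption: float64 elements, i.e. 8 bytes each.
itemsize = 8

# Total size of the array that ends up being moved to the device.
total_bytes = math.prod(shape) * itemsize

# Number of blocks along each axis is ceil(axis length / chunk length).
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

# The largest block is a full chunk (capped by the axis length).
largest_chunk_bytes = math.prod(min(s, c) for s, c in zip(shape, chunks)) * itemsize

print(f"total array size : {total_bytes / 2**30:.2f} GiB")
print(f"number of chunks : {n_chunks}")
print(f"largest chunk    : {largest_chunk_bytes / 2**20:.0f} MiB")
```

Under these assumptions the array is about 1.84 GiB split into 20 chunks. Since the array is lazy, each chunk is generated on the CPU and then copied to the device inside the timed section, so the GPU timing includes those transfers as well as the mean itself.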
I’m working with WSL2 on Windows 11.
nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77.01              Driver Version: 566.36         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 ...    On  |   00000000:01:00.0 Off |                  N/A |
| N/A   32C    P8              3W /   30W |       0MiB /   8188MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
and
nvidia-smi topo -m
GPU0 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X N/A
What do you suggest?
Thanks
François