Dask slower than numpy

I am a new Dask user and I’m trying to call dot inside my program. I noticed that Dask’s dot is slower than the NumPy version, even when I use a single chunk for the whole array. How can this behaviour be explained?

import dask.array as da 
import numpy as np
x = da.random.normal(10, 0.1, size=(20000 * 100000), chunks=(20000 * 100000))
z = x.dot(x)
%time z.compute()
'''
CPU times: user 1min 1s, sys: 17.3 s, total: 1min 18s
Wall time: 52 s
'''
y = x.compute()

%time w = y.dot(y)
'''
CPU times: user 19 s, sys: 8.24 s, total: 27.2 s
Wall time: 767 ms
'''

Hi @ahmed,

You’re not timing the same work in Dask and NumPy.

Dask is lazy, so the first two lines:

x = da.random.normal(10, 0.1, size=(20000 * 100000), chunks=(20000 * 100000))
z = x.dot(x)

do nothing on their own: they only build a task graph. When you time z.compute(), you’re timing the generation of the random array plus the dot operation.

In NumPy, you’re only timing the dot operation (the array has already been created in memory), so it is fast.
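If you want to see the two costs separately, here is a minimal sketch. It assumes the default local scheduler, where persist() blocks until the chunks are in memory; the size n and the variable names are mine, chosen small so it runs quickly.

```python
import time

import dask.array as da

n = 2_000_000  # hypothetical small size, far below the one in the question

# Lazy: these two lines only build a task graph, no data is generated yet.
x = da.random.normal(10, 0.1, size=n, chunks=n)
z = x.dot(x)

# compute() now runs both the random generation and the dot product.
t0 = time.perf_counter()
total = z.compute()
both_seconds = time.perf_counter() - t0

# persist() materialises the chunks in memory (assumption: local scheduler,
# so the call returns only once the data exists).
x_mem = x.persist()

# This timing covers the dot product only, like the NumPy timing in the question.
t0 = time.perf_counter()
dot_only = x_mem.dot(x_mem).compute()
dot_seconds = time.perf_counter() - t0

print(f"generation + dot: {both_seconds:.3f} s")
print(f"dot only:         {dot_seconds:.3f} s")
```

On a persisted array the dot timing should be much closer to what you measured with NumPy, although the exact numbers depend on your machine.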

See what I get (with a smaller input):

Dask:

x = da.random.normal(10, 0.1, size=(20000 * 20000), chunks=(20000 * 20000))
z = x.dot(x)
%time z.compute()
CPU times: user 10 s, sys: 646 ms, total: 10.7 s
Wall time: 9.97 s

40003968048.35344

NumPy:

%%time
a = np.random.normal(10, 0.1, size=(20000 * 20000))
b = a.dot(a)
CPU times: user 10.6 s, sys: 355 ms, total: 10.9 s
Wall time: 10.5 s

And Dask with chunking (four chunks):

x = da.random.normal(10, 0.1, size=(20000 * 20000), chunks=(10000 * 10000))
z = x.dot(x)
%time z.compute()
CPU times: user 10.7 s, sys: 1.13 s, total: 11.8 s
Wall time: 2.98 s

40004001781.91481
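The wall time drops because, with several chunks, Dask can generate and process them in parallel (the CPU time stays about the same). The result differs slightly from the first run only because the data is random. Here is a small sketch to check how many chunks a given chunks value produces; the size n is a hypothetical small value, not the one from the timings above.

```python
import dask.array as da

n = 4_000_000  # hypothetical small size so this runs quickly

# One chunk covering the whole array: a single task generates all the data.
single = da.random.normal(10, 0.1, size=n, chunks=n)

# Four equal chunks: each one can be generated and reduced by a separate worker thread.
split = da.random.normal(10, 0.1, size=n, chunks=n // 4)

print(single.numblocks)  # (1,)
print(split.numblocks)   # (4,)

# The dot product is computed per chunk and the partial sums are then combined.
result = split.dot(split).compute()
print(result)
```

Whether more chunks help depends on the number of cores and on the chunk size: very small chunks add scheduling overhead, so this is something to measure rather than assume.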