# Why do chunks get inverted?

Hey,

I have a question about the reasoning behind the implementation of chunks.
When I call map_blocks on two arrays, naturally the chunks need to match.
I noticed by trail and error that the chunks need to align in reverse, if the dimensions don’t align.
To illustrate this, see the following example and note that arr2 and arr3 have their chunks swapped:

``````import numpy as np

return x + y

arr1 = da.arange(10, chunks=((1,3,3,3),))
arr2_np = np.arange(100).reshape((10, 10))
arr2 = da.from_array(arr2_np, chunks = ((1,3,3,3), (5,5)) )
arr3 = da.from_array(arr2_np, chunks = ((5,5), (1,3,3,3)) )

``````

The first map_blocks call (with arr2) fails with the following error:

``````Traceback (most recent call last):
File "<stdin>", line 1, in <module>
(result,) = compute(self, traverse=False, **kwargs)
dsk = collections_to_dsk(collections, optimize_graph, **kwargs)
dsk = opt(dsk, keys, **kwargs)
dsk = fuse_roots(dsk, keys=keys)
new = toolz.merge(layer, *[layers[dep] for dep in deps])
File "/home/timo/Documents/open_source/tmillenaar/dask/venv/lib/python3.10/site-packages/toolz/dicttoolz.py", line 39, in merge
rv.update(d)
File "/usr/lib/python3.10/_collections_abc.py", line 886, in __iter__
yield from self._mapping
return iter(self._dict)
dims=self.dims,
self._dims = _make_dims(self.indices, self.numblocks, self.new_axes)
raise ValueError("Shapes do not align %s" % g)
ValueError: Shapes do not align {'.0': {2, 4}, '.1': {4}}
``````

The second attempt (with arr3) works.

Intuitively, I expected the first (and only) axis of arr1 to align with the first axis arr2, but it seems instead it aligns with the last axis of arr2.
This is because the order is deliberately flipped here:

Now I wonder, why is this being flipped?

Cheers,
Timo

P.S.
I was attempting to debug an issue I filed a while back.

It was not given any attention so I decided to give it a try.
I realized though that I might just not understand the reason for the design decisions made in the past.

Okay, I’m not an array expert, but I believe this has to do with Numpy broadcasting rules. I just tried the following, building fake chunks with similar shape as those from `arr2` and `arr3` in your example:

``````arr2_chunk = np.arange(15).reshape((3, 5))
arr1.blocks[1].compute() + arr2_chunk
``````

results in:

``````---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[29], line 1
----> 1 arr1.blocks[1].compute() + arr2_chunk

ValueError: operands could not be broadcast together with shapes (3,) (3,5)
``````

Whereas:

``````arr3_chunk = np.arange(15).reshape((5,3))
arr1.blocks[1].compute() +  arr3_chunk
``````

just works:

``````array([[ 1,  3,  5],
[ 4,  6,  8],
[ 7,  9, 11],
[10, 12, 14],
[13, 15, 17]])
``````

So `map_blocks` is just checking if the Numpy operations can be performed according to the shape of the Arrays.

Thanks @guillaumeeb, that must be it!

I just had a look at numpy’s broadcasting rules because of your suggestion. Intuitively I was trying to match up the first axis of arr1 with the first axis of arr2. I think I made the following mistake:

It nevertheless has some funny consequences when using map_blocks but at least I understand now where these originate.

Cheers,
Timo

1 Like