*This was originally posted on GitHub Discussions: fast dask-parallel in-place modification of xr.Dataset data (specific dimensions) · Discussion #9426 · dask/dask · GitHub*

[ Re-posted here following advice from the xarray-dev community: fast dask-parallel "shuffling" of xr.Dataset dimensions · Discussion #6951 · pydata/xarray · GitHub ]

Hi there,

I am trying to come up with a fast, Dask-parallel way of “shuffling” the contents of an xr.Dataset (or Datarray) along certain dimensions. An example below:

```
# create LocalCluster, dask Client, etc.
# open a large netCDF file
dset = xr.open_dataset(..., chunk={"t": 1, "z": 10})
# dset contains a single variable called "dummy" with dimensions (t, z, y, x) == (5, 50, 1000, 1000)
def pseudo_shuffle_xy(ds: xr.Dataset) -> xr.Dataset:
"""Shuffle values along (x, y) dimensions and return the shuffled dataset."""
for k in range(len(ds.z)):
dict_isel_ = {
'x': xr.DataArray(data=np.random.randint(low=0, high=nx, size=(ny, nx)), dims=['y', 'x']),
'y': xr.DataArray(data=np.random.randint(low=0, high=ny, size=(ny, nx)), dims=['y', 'x'])
}
# not proper shuffling, but close enough
ds.dummy.loc[dict(z=k)] = ds.dummy.isel(z=k).isel(dict_isel_)
return ds
result = dset.map_blocks(pseudo_shuffle_xy, template=dset).compute()
```

So the call to map_blocks seems to work as expected but it’s likely slower than it has to be - and the manual loop is ugly.

I’ve also tried to use `xr.apply_ufunc`

:

```
def pseudo_shuffle_xy_v2(da: xr.Dataset):
nx, ny = da.shape[0], da.shape[1] # da.shape = (1000, 1000, 50, 5), chunk size = (1000, 1000, 5, 1)
idx_shuffled_x, idx_shuffled_y = np.random.permutation(nx), np.random.permutation(ny)
return da.vindex[idx_shuffled_x, ...][:, idx_shuffled_y, ...]
result = xr.apply_ufunc(pseudo_shuffle_xy_v2, dset_dummy, input_core_dims=[["z", "t"]], output_core_dims=[["z", "t"]], dask="allowed", vectorize=True).compute()
```

This is faster than the `map_blocks`

call but i get warnings about dask creating large chunks - likely as a result of the `vindex`

call.

What’s the best way to vectorize the “shuffle” (in-place assignment, really) using Dask?