Using Dask to rescale a large NumPy array

Hi.

I am trying to scale an image along one axis to make the volume isotropic. The image is a NumPy array of shape (95,3739,1895).

I tried using dask as follows:

import dask.array as da

tiles = da.from_array(image, chunks=(95,512,512))  # image divided into chunks

from scipy.ndimage import zoom

def procedure(target):
  print("proceduring",target.shape)
  return zoom(target, [10,0.994,0.994])

tile_map = da.map_overlap(procedure, tiles)
result = tile_map.compute()

This had almost finished: it printed “proceduring” 32 times (equal to the number of chunks), but then it hit the maximum memory and crashed the system. My system has 64 GB of RAM.

When I created a lot more chunks (using chunks=(95,25,25)), it also crashed (it didn’t even finish processing all the chunks, I think).

Is there a better way to get this interpolation done reasonably fast using Dask?

Hi @dub2s, welcome to the Dask community!

So I just tried your example with a random array as the input image, and a distributed cluster so I could monitor the computation using the Dashboard, like so:

from distributed import Client
import dask.array as da
from scipy.ndimage import zoom

client = Client()
client

image = da.random.random((95,3739,1895), chunks=(95,512,512))

def procedure(target):
  print("proceduring",target.shape)
  return zoom(target, [10,0.994,0.994])

tile_map = da.map_overlap(procedure, image)
tile_map.compute()

Since I’m also using a Jupyter notebook, I’m able to display the shape and size of the arrays. Dask tells me that the input image (using float64 pixels) is 5.02 GiB, with 190 MiB chunks. It also tells me the output is the same. Is that correct?
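
Those numbers are consistent with the shape and dtype; here is a quick back-of-envelope check (my own arithmetic, not from the original thread):

# 8 bytes per float64 pixel
print(95 * 3739 * 1895 * 8 / 2**30)   # whole array: ~5.02 GiB
print(95 * 512 * 512 * 8 / 2**20)     # one (95, 512, 512) chunk: ~190 MiB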

When I launch it with the default Client/LocalCluster on my machine, it runs 8 tasks in parallel. I don’t know exactly what the zoom operation is doing internally, but for every chunk processed, memory gradually grows to about 2 GiB, and a result between 1.3 and 1.8 GiB is kept in memory. So my 16 GiB laptop is quickly saturated and everything crashes.

The thing is, since you are calling compute, Dask tries to keep the chunks in memory, or spill them to disk if it needs some space, in order to return the whole image at the end, but this is not enough here.

Is it normal to have that much memory used for each chunk when using zoom?
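
A rough back-of-envelope calculation (mine, not from the thread) suggests this order of magnitude is expected: zooming a full (95, 512, 512) float64 chunk by [10, 0.994, 0.994] produces a block of roughly (950, 509, 509), which is about 1.8 GiB on its own.

in_shape, factors = (95, 512, 512), (10, 0.994, 0.994)
out_shape = tuple(int(round(s * f)) for s, f in zip(in_shape, factors))
print(out_shape)                                               # (950, 509, 509)
print(out_shape[0] * out_shape[1] * out_shape[2] * 8 / 2**30)  # ~1.83 GiB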

As for chunks=(95,25,25): I think this would generate a few too many chunks to handle.
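
For reference (again my own rough count, not from the thread), chunks of (95,25,25) on a (95,3739,1895) array would mean over ten thousand blocks:

import math
print(math.ceil(3739 / 25) * math.ceil(1895 / 25))   # 150 * 76 = 11400 blocks to schedule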

From what I observed, I think there are several things to check:

  • Instead of computing the result (which means it is loaded into memory at the end, and Dask has to keep all the chunks in memory or spilled to disk), try saving it to disk in a chunked format so that it can be written in a streaming fashion, for example using Zarr (see the sketch after this list).
  • Try slightly smaller chunks, like (95,128,128).
  • Try fewer threads working at the same time, for example using Client(n_workers=4, threads_per_worker=1).
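
Here is a minimal sketch of what the first and third suggestions could look like combined (untested on the real data; the random stand-in image, the output path rescaled.zarr, the switch from map_overlap to map_blocks, and the explicit output-chunk computation are my assumptions, not part of the original answer):

from distributed import Client
import dask.array as da
from scipy.ndimage import zoom

# Fewer concurrent tasks -> fewer large intermediate results held in memory at once
client = Client(n_workers=4, threads_per_worker=1)

# Random stand-in for the real image, with the smaller chunks suggested above
image = da.random.random((95, 3739, 1895), chunks=(95, 128, 128))

factors = (10, 0.994, 0.994)

def procedure(target):
    return zoom(target, factors)

# zoom changes each block's shape, so give Dask the output chunk sizes explicitly
# (assuming zoom rounds each output dimension to int(round(size * factor)))
out_chunks = tuple(
    tuple(int(round(c * f)) for c in dim_chunks)
    for dim_chunks, f in zip(image.chunks, factors)
)

rescaled = image.map_blocks(procedure, dtype=image.dtype, chunks=out_chunks)

# Stream the result to a chunked store on disk instead of loading it all with compute()
rescaled.to_zarr("rescaled.zarr")

Writing to Zarr lets each output block go to disk as soon as it is computed, so the scheduler never has to hold the full result (roughly ten times the size of the input image) in memory or spill it itself.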

I tried with slightly smaller chunks and fewer threads working at the same time. It did not run to completion on my machine, because I didn’t have enough free space on the local disk to spill all the resulting chunks, but I think it was well on its way to finishing.