Hi @dub2s, welcome to the Dask community!
So I just tried your example with a random array as the input image, and a distributed cluster so I could monitor the computation using the Dashboard, like so:
```python
from distributed import Client
import dask.array as da
from scipy.ndimage import zoom

client = Client()
client

image = da.random.random((95, 3739, 1895), chunks=(95, 512, 512))

def procedure(target):
    print("proceduring", target.shape)
    return zoom(target, [10, 0.994, 0.994])

tile_map = da.map_overlap(procedure, image)
tile_map.compute()
```
I'm also using a Jupyter notebook, so I'm able to print the shape and size of the arrays. Dask tells me the input image (using float64 pixels) is 5.02 GiB, with 190 MiB chunks. It also tells me the output is the same size. Is that correct?
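For reference, the reported input sizes are consistent with 8-byte float64 pixels; a quick check:

```python
# Sizes the Dashboard reports for the input array
95 * 3739 * 1895 * 8 / 2**30   # ≈ 5.02 (GiB, whole image)
95 * 512 * 512 * 8 / 2**20     # = 190.0 (MiB, one chunk)
```

Given the zoom factor of 10 along the first axis, though, I would expect the output to be about ten times bigger, so the "same size" report suggests Dask isn't aware that `zoom` changes the block shapes.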
When I launch it with the default Client/LocalCluster on my machine, it runs 8 tasks in parallel. I don't know exactly what the `zoom` operation is doing, or at least how it works, but for every chunk processed, memory gradually grows to about 2 GiB, and a result between 1.3 and 1.8 GiB is kept in memory. So my 16 GiB laptop quickly saturates and everything crashes.
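Part of that growth is expected from the zoom factor alone: scaling a (95, 512, 512) chunk by [10, 0.994, 0.994] produces roughly a (950, 509, 509) float64 block:

```python
# Size of the output block produced from one (95, 512, 512) chunk
# after zoom with factors [10, 0.994, 0.994]
950 * 509 * 509 * 8 / 2**30   # ≈ 1.83 GiB per result block
```

That matches the 1.3 to 1.8 GiB results kept in memory, and the ~2 GiB peaks presumably come from `zoom` also needing working buffers (spline prefiltering makes a copy of the input) on top of the input and output.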
The thing is, since you are calling `compute`, Dask tries to keep the chunks in memory, or spill them to disk when it needs space, so that it can return the whole image at the end, but that is not enough here.
Is it normal for `zoom` to use this much memory for each chunk processed?
I think this would generate a bit too many chunks to handle.
From what I observed, I think there are several things to check:
- Instead of computing the result (which means it will all be loaded into memory at the end, so Dask has to keep every chunk in memory or spilled to disk), try saving it to a chunked format on disk so it can be written in a streaming fashion, for example using Zarr (all three suggestions are combined in the sketch after this list).
- Try slightly smaller chunks, like `(95, 128, 128)`.
- Try fewer threads working at the same time, for example using `Client(n_workers=4, threads_per_worker=1)`.
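Here is a rough sketch putting those together (not tested end to end; the `chunks=` computation assumes scipy's rounding of the zoomed shape, and `out.zarr` is just an example path):

```python
from distributed import Client
import dask.array as da
from scipy.ndimage import zoom

# Fewer tasks running at the same time
client = Client(n_workers=4, threads_per_worker=1)

# Slightly smaller chunks
image = da.random.random((95, 3739, 1895), chunks=(95, 128, 128))

factors = [10, 0.994, 0.994]

def procedure(target):
    return zoom(target, factors)

# zoom changes each block's shape, so tell Dask what the output chunks
# will be (scipy rounds each output length to round(length * factor))
out_chunks = tuple(
    tuple(int(round(c * f)) for c in dim)
    for dim, f in zip(image.chunks, factors)
)

tile_map = da.map_overlap(procedure, image, chunks=out_chunks)

# Write chunks to disk as they are computed instead of keeping
# everything in memory (requires the zarr package)
tile_map.to_zarr("out.zarr")
```

With `to_zarr`, each result block can be written and released as soon as it is computed, so the peak memory is bounded by the blocks in flight rather than the whole output.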
I tried with slightly smaller chunks and fewer threads working at the same time; it did not run to completion on my machine because I didn't have enough free space on local disk to spill all the resulting chunks, but I think it was on its way to finishing.