Thanks for your help. I suspect that my use case is a little different from the one that
map_overlap was designed for. As I understand it,
map_overlap is mainly useful if you are mapping a function which takes a local neighbourhood, and returns a single value (for example, taking a derivative). What I want to do is map a function which takes a local neighbourhood, and returns a local neighbourhood. Those returned local neighbourhoods would then overlap in the final array.
What exactly ‘overlapping’ means is tricky - consider the various blending modes in image software. My need is for overlapping values to simply be summed (essentially an additive/linear-dodge blending mode).
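To make the additive blending concrete, here is a minimal 1-D NumPy sketch (the sizes are made up for illustration): two neighbourhoods that overlap by two elements are simply summed in the overlapping region.

```python
import numpy as np

# Two length-4 "neighbourhoods" placed into a length-6 output,
# overlapping on indices 2 and 3.
out = np.zeros(6)
out[0:4] += np.ones(4)  # first neighbourhood occupies indices 0..3
out[2:6] += np.ones(4)  # second neighbourhood occupies indices 2..5

print(out)  # -> [1. 1. 2. 2. 1. 1.]
```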
I asked this question because returning local neighbourhoods, and summing them where they overlap, felt like a natural idea, and I was curious whether there is a way of doing this with
map_overlap that I was missing.
I’ve put together the following MWE for a 2D array, which hopefully clarifies what I am trying to achieve:
import numpy as np
import dask.array as da
arr = da.random.random((100, 100), chunks=(10, 10))
depth = 2
result = da.map_overlap(lambda x: x**2, arr, depth=depth, boundary=0, trim=False)
result_untrimmed = result.compute()
# Re-shape/re-slice the result array into the original shape, using an additive blending mode.
# Original chunk sizes along each axis, and the start index of each chunk
chunk_sizes = arr.chunks
chunk_starts = [np.concatenate([[0], np.cumsum(sizes[:-1])]) for sizes in chunk_sizes]
# Chunks expanded by `depth` on every side (call them "blocks")
block_sizes = tuple(np.asarray(sizes) + 2 * depth for sizes in chunk_sizes)
block_starts = [np.concatenate([[0], np.cumsum(sizes[:-1])]) for sizes in block_sizes]
# Prepare final array, padded by `depth` on every side
result_final = np.zeros(np.array(arr.shape) + 2 * depth)
# Iterate over the blocks, and add them to the final result array
for i in range(len(chunk_sizes[0])):
    for j in range(len(chunk_sizes[1])):
        # Get the size of the current block along each axis
        bi, bj = block_sizes[0][i], block_sizes[1][j]
        # Extract the current block from the untrimmed result
        block = result_untrimmed[block_starts[0][i]:block_starts[0][i] + bi,
                                 block_starts[1][j]:block_starts[1][j] + bj]
        # Get the start indices of the current block in the (padded) final array
        ci, cj = chunk_starts[0][i], chunk_starts[1][j]
        # Add the current block to the final array
        result_final[ci:ci + bi, cj:cj + bj] += block
# If desired, trim the outer excess now.
result_final = result_final[depth:-depth, depth:-depth]
Of course, if the chunk sizes are relatively small and depth is large, materialising the full untrimmed array like this can take up a lot of memory!
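One way I can think of to reduce the memory cost is to avoid materialising the whole untrimmed array, and instead compute one overlapped block at a time via the `.blocks` accessor and add it straight into the final array. A sketch of that idea (using the same elementwise placeholder function as above, on a smaller array; `.blocks` and `.numblocks` are standard dask.array attributes):

```python
import numpy as np
import dask.array as da

arr = da.random.random((40, 40), chunks=(10, 10))
depth = 2
result = da.map_overlap(lambda x: x**2, arr, depth=depth, boundary=0, trim=False)

# Start index of each original chunk along each axis
chunk_starts = [np.concatenate([[0], np.cumsum(sizes[:-1])]) for sizes in arr.chunks]

# Final array, padded by `depth` on every side
final = np.zeros(np.array(arr.shape) + 2 * depth)

for i in range(arr.numblocks[0]):
    for j in range(arr.numblocks[1]):
        # Only one overlapped block is held in memory at a time
        block = result.blocks[i, j].compute()
        ci, cj = chunk_starts[0][i], chunk_starts[1][j]
        final[ci:ci + block.shape[0], cj:cj + block.shape[1]] += block

# Trim the outer excess as before
final = final[depth:-depth, depth:-depth]
```

This trades the single large intermediate array for one compute() call per block, so whether it is worthwhile depends on how much scheduling overhead that adds.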