# How to get the original i,j,k location in blockwise operation

If I have an operation like `dask_array_object.blocks` and iterate over the blocks resulting from that, how can I get the original `i,j,k` location of each part inside of the function that is executing in a blockwise fashion?

Also, I noticed that there is an option of using the function `dask.array.blockwise()` for doing blockwise operations, same question. How can you know the original `i,j,k` location of the part in the dask array inside of the function executing in a blockwise fashion?

@gavargas22 Welcome to Discourse! I think you can check out Dask Arrayâ€™s `map_blocks`, it includes `block_id` and `block_info` keyword arguments that store the chunk location and more information respectively:

``````import dask.array as da

x = da.random.randint(100, size=(10,10), chunks=(5,5))

def func(x, block_id=None, block_info=None):
print ("block_id = ", block_id)
print ("block_info = ", block_info)
return x+1

da.map_blocks(func, x).compute()
``````
2 Likes

Thatâ€™s a good solution, I tried this, but the function that I am executing with `map_blocks` is a function that writes to a file on disk in parallel and in your example I return x (which is the block); so when the whole operation completes, I get a full sized numpy array.

Is there a way to return something that is empty, and does not occupy the space? If so, map_blocks would be perfect for me

My data is a 300 GB dask array, and I just want to take the values of each block and write them into a file

I am atempting something like this:

``````import dask.array as da

x = da.random.randint(100, size=(2000,2000,2000)))

def func(x, block_id=None, block_info=None):
# Grab the values of the 3D cube from Zarr disk store
block_data = x.compute()
# Function that writes the actual values to disk
write_value_to_binary(block_data, "./file/datafile.bin")
# Attempt to release the memory?
x.close()

return x

da.map_blocks(func, x).compute()
``````

Hi @gavargas22! In your snippet you mention youâ€™re reading the array from Zarr, in which case you can use `dask.array.from_zarr` and `dask.array.to_zarr`, without necessarily needing to call `compute`. There are more details here, but you can do something like:

``````import dask.array as da

# save zarr file
x = da.random.randint(100, size=(10,10), chunks=(5,5))
x.to_zarr('test.zarr')
y = da.from_zarr('test.zarr')
# do some stuff
z = y[::2, 5000:].mean(axis=1)
# save result
z.to_zarr('result.zarr')
``````
1 Like

@gavargas22 it looks like you also posted this question to stack overflow, where there is another answerâ€“ do any of these address your question?

It seems that I can return the smallest array I can on each iteration of `map_blocks` and solve my problem.

But I also have the question of: When you use `.blocks.ravel()` How do you reconstruct the original shape and block arrangement into another numpy-like object.

I want to take each block from `.blocks.ravel()` do some computation and then write to a binary file at the specific `i,j,k` locations where the block is supposed to be.

Essentially, I am writing a binary file in a piecewise fashion.

It turns out `numpy` is quite strict about what one can and cannot be converted into an array. Essentially an object must be â€śarray-likeâ€ť to be able to get converted. `.blocks.ravel()` is not, since its items have different shapes in general.

You could however spoof `numpy` into thinking `.blocks.ravel()` is array-like by wrapping each of its items with an arbitrary non-array-like object:

``````import numpy as np

class wrapped():
def __init__(self, block):
self.view = block

# an example dask array `x` of arbitrary shape and chunks:
x = da.from_array(np.arange(4*3*5).reshape((4, 3, 5)), chunks=2)
x_block_list = x.blocks.ravel()
blocks = np.array([wrapped(block) for block in x_block_list])
``````

which can then be reshaped into the desired structure:

``````block_array = blocks.reshape(x.numblocks)
``````

whose elements can be referenced with block indices. For example,

``````block_array[1, 0, 2].view.compute()
``````

But I find the above approach somewhat unwieldy, since the easiest way to directly reference a block of a `dask` array `x` (if thatâ€™s what you really want) is to do:

``````x.dask[(x.name, 1, 0, 2)]
``````
3 Likes