Is there a way to affect the order of the `map_blocks` function?

I’m using `map_blocks` to process data and save a zarr file lazily. Is there a way to process the values in order? Currently, as it writes chunks, some chunks come early and some come late. It would be nice if they were processed in order.

```python
out = dask.array.map_blocks(torchit, dtype="uint16", chunks=chunks, meta=numpy.array((), dtype="int16"))
```

It’s not clear from the documentation, but it seems like specifying both dtype and meta is redundant in this example? Should I prefer one over the other?

The values that get saved first are often from late in the series. For example, I have 200 frames to process, and it will process frames 0, 10, 40, and 150 before 1 or 2 have been written.

Thanks

Hi @odinsbane, welcome to Dask community!

Depending on your workflow, it might not be simple. Do you have more details about the operations you perform?

You can see in the documentation that you can visualize task priorities from your task graph. This page also mentions the kwarg inline_array=True, which might be useful.
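For example, something like this (a rough sketch, where `out` stands for the result of your `map_blocks` call and the filename is arbitrary):

```python
import dask

# color the graph nodes by the order in which Dask plans to run them
dask.visualize(out, color="order", filename="order.svg")
```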

In simple cases dtype is enough, but meta might be needed for others.
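For instance, with a toy array (just a sketch to illustrate the two keywords):

```python
import numpy as np
import dask.array as da

arr = da.ones((4, 4), chunks=(2, 2))

# simple case: dtype alone is enough, Dask assumes a plain numpy output
out1 = arr.map_blocks(lambda b: b.astype("uint16"), dtype="uint16")

# meta describes the output container as well as the dtype; it matters
# when the blocks are not plain numpy arrays (sparse, masked, ...)
out2 = arr.map_blocks(lambda b: b.astype("uint16"),
                      meta=np.array((), dtype="uint16"))
```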

And if the above does not give you a good solution, you might want to play with priority, as described in the example on this page:

```python
with dask.annotate(priority=lambda k: k[1]*nblocks[1] + k[2]):
    A = da.ones((1000, 1000), chunks=(100, 100))
```

Thanks for the response!

As for my workflow, I am using ngff_zarr to load a zarr image and then processing it by converting chunks to numpy arrays and running them through cellpose.

Essentially my workflow looks like this:

```python
dask_array = ngff_zarr.load_my_zarr_file()

def process(block_id, data=dask_array):
    # run cellpose on the frame selected by this block's index
    y = cellpose_model.eval(numpy.array(data[block_id[0]]))
    return y

out = dask.array.map_blocks(process, dtype="uint16", chunks=chunks)
ngff_zarr.save_out_as_zarr()
```

Can I use inline? It seems like I cannot, because it would need to happen at the point of dask array creation. It seems to be a similar situation with annotate, although maybe I could use annotate and then load the zarr file.

Another alternative would be to load the zarr with dask.array.from_zarr: I could use ngff_zarr to handle the metadata, then use from_zarr to load the same backing array but with inline_array=True.
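Roughly something like this (just a sketch, untested; I’m assuming ngff_zarr exposes from_ngff_zarr for the metadata and that the first scale level lives at component "0"):

```python
import dask.array as da
import ngff_zarr

# keep ngff_zarr for the OME-NGFF metadata (assumed call name)
multiscales = ngff_zarr.from_ngff_zarr("image.ome.zarr")

# load the same backing array directly, inlining the zarr chunks into
# each task instead of going through a shared array-creation task
arr = da.from_zarr("image.ome.zarr", component="0", inline_array=True)
```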

> In simple cases dtype is enough, but meta might be needed for others.

Good to know.

Inline would probably not work here.

Annotate should work though, with the proper `with` syntax.
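Something along these lines, reusing dask_array, process and chunks from your snippet (only a sketch: it assumes at least two chunked dimensions, priority annotations are picked up by the distributed scheduler, and you may need to flip the sign or adjust the index arithmetic for your chunk layout):

```python
import dask
import dask.array

nblocks = dask_array.numblocks  # chunk-grid shape of the input array

# build the map_blocks tasks inside the annotate context so each task
# key k = (name, i, j, ...) gets a priority derived from its block index
with dask.annotate(priority=lambda k: -(k[1] * nblocks[1] + k[2])):
    out = dask.array.map_blocks(process, dtype="uint16", chunks=chunks)
```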

I’m not familiar with ngff_zarr, but this proposal, or digging a little into the options you can pass to ngff_zarr, could be another lead.

> I’m not familiar with ngff_zarr, but this proposal, or digging a little into the options you can pass to ngff_zarr, could be another lead.

I looked but didn’t see anything obvious. I tried loading the dask array using the inline_array argument, but it didn’t change the order in which things get written.

Looking at map_blocks, I don’t know how it would know to change the order based on the dask array provided. I guess it sets up a whole computation graph and goes to work, so if an object in the graph demands a particular order, then that is the order it will work in.

Did you try using the visualize code from the documentation link to see how Dask intends to process your graph?

Would you be able to create a reproducer?

If I include visualize, I get a wide graph.

This is the code I am using; it makes a [cellpose prediction](zarr-recipes/src/scripts/predict_cellpose-2.py at master · Living-Technologies/zarr-recipes · GitHub) from a zarr file.

Could you try using visualize with the `color="order"` kwarg to see how Dask intends to process the graph?