Calculating Average Image Intensity Over Zarr Chunks

Hello Dask world! I'm trying to work out the right way to compute the average intensity of each frame inside a zarr store. I have about 56k images in a zarr (here’s a link to a sample of 500 of them), each 512x512 pixels. @d-v-b pointed me toward something like the code below (or at least this is how I interpreted the suggestion, I may well have gotten it wrong!).

import dask.array

def compute_average_intensity(zarrurl):
    # Lazily open the zarr store as a dask array
    z = dask.array.from_zarr(zarrurl)

    # Reduce over axes 0 and 1
    avgs = z.mean(axis=(0,1))

    res = avgs.compute()

    print(res.shape)

When I do this, the result has shape (512,), which is definitely not right. I’m guessing I’m using the axis argument incorrectly. I had expected the length of the result to equal the number of images in the zarr, since I was hoping to take the average of each individual frame. Instead, I think I averaged across the chunks by accident, or perhaps averaged over the whole zarr for each row or something?

I’ve been tinkering with map_blocks and iterating through each frame, like @ParticularMiner showed me how to do here, but have been struggling to get that quite right. I’m also not sure whether it’s necessary to use map_blocks at all, versus just the built-in mean function like in the code block above.

I should have tried for just a couple more minutes! I wasn’t averaging over the correct axes.

The input zarr shape is (500, 512, 512), so (z, x, y). I had in my head that it was instead (512, 512, 500), so (x, y, z). If you do:

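import dask.array

def compute_average_intensity(zarrurl):
    # Lazily open the zarr store; shape is (frames, 512, 512)
    z = dask.array.from_zarr(zarrurl)

    # Average over the two pixel axes, leaving one value per frame
    avgs = z.mean(axis=(1,2))

    res = avgs.compute()

    print(res.shape)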

You get what seems to be the correct answer! The shape of your result is (500,) with the test images.
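To make the difference between the two axis choices concrete, here is a small sketch on a synthetic array rather than the real zarr store. The array sizes (5 frames of 4x6 pixels) and the variable names are made up for illustration; only the (frame, x, y) layout matches the data above.

```python
import numpy as np
import dask.array as da

# Synthetic stand-in for the zarr store: 5 frames of 4x6 pixels,
# chunked one frame per chunk (an assumption for this example).
frames = np.arange(5 * 4 * 6, dtype=float).reshape(5, 4, 6)
z = da.from_array(frames, chunks=(1, 4, 6))

# Check the layout before picking axes.
print(z.shape)

# axis=(0, 1) collapses the frame axis and the first pixel axis,
# leaving one value per position along the last axis.
col_means = z.mean(axis=(0, 1)).compute()

# axis=(1, 2) collapses both pixel axes, leaving one value per frame.
frame_means = z.mean(axis=(1, 2)).compute()

print(col_means.shape, frame_means.shape)
```

With a (500, 512, 512) input the same reasoning explains the first attempt: reducing over axes 0 and 1 leaves only the last axis, which has length 512.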

Check your input data dims, folks!
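On the map_blocks question from earlier: for a plain per-frame mean it does not seem to be needed, but for comparison here is a minimal sketch of how it could look. The helper name per_frame_mean_blocks and the random test data are hypothetical, and the sketch assumes the array is laid out as (frame, x, y). It rechunks first so every block holds whole frames, since the per-block function only sees the pixels inside its own block.

```python
import numpy as np
import dask.array as da

def per_frame_mean_blocks(z):
    """Per-frame mean via map_blocks (hypothetical helper, for comparison only)."""
    # Make each chunk span the full x and y extent so no frame is split.
    z = z.rechunk({1: -1, 2: -1})
    # Each block has shape (n, x, y); reduce it to (n,) and tell dask
    # that axes 1 and 2 disappear from the output.
    return z.map_blocks(
        lambda block: block.mean(axis=(1, 2)),
        drop_axis=(1, 2),
        dtype=float,
    )

# Synthetic data: 8 frames of 16x16 pixels, deliberately chunked
# so that frames are split across blocks before rechunking.
frames = np.random.default_rng(0).random((8, 16, 16))
z = da.from_array(frames, chunks=(2, 8, 8))

via_blocks = per_frame_mean_blocks(z).compute()
via_mean = z.mean(axis=(1, 2)).compute()

print(np.allclose(via_blocks, via_mean))
```

Both routes give one value per frame; the built-in mean is shorter and handles frames split across chunks without the explicit rechunk.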