Memory error using dask-image ndmeasure.label

Hi,

I am getting an out-of-memory (OOM) error when calling ndmeasure.label on a large array.
For example:

import dask.array as da
import dask_image.ndmeasure

nx = 5120  # things are OK for nx < 2500
arr = da.random.random(size=(nx, nx, nx))
darr_bin = arr > 0.8
# The next line fails with an OOM error
label_image, num_labels = dask_image.ndmeasure.label(darr_bin)

Note that the error already occurs at the call to label itself, not when the computation is actually run via, e.g., num_labels.compute() (a small check of this is below).
This also means I have the same problem when using a (large) cluster, since the OOM always occurs on node 1.
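
As a sanity check, here is a smaller variant (the size is just for illustration) showing that label returns lazily when the call succeeds, so at nx = 5120 the memory must already be exhausted while building the graph:

import dask.array as da
import dask_image.ndmeasure

nx = 1024  # small enough that label() succeeds
darr_bin = da.random.random(size=(nx, nx, nx)) > 0.8
label_image, num_labels = dask_image.ndmeasure.label(darr_bin)
# label() returned immediately without computing anything;
# the actual work is only triggered here:
print(num_labels.compute())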

I am not sure why this happens or how to prevent it. Splitting the array manually and labelling each piece independently is a workaround (a minimal sketch is below), but of course one then misses structures that span the split boundaries. Any help would be appreciated.
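
For reference, this is the kind of manual-splitting workaround I mean: labelling each chunk independently with scipy.ndimage.label (the chunk size here is just an example). Label IDs are only unique within a chunk, and connected structures crossing chunk boundaries get split:

import dask.array as da
import numpy as np
from scipy import ndimage

nx = 5120
darr_bin = da.random.random(size=(nx, nx, nx), chunks=512) > 0.8

# scipy.ndimage.label returns (labelled_array, num_features); keep only the array.
# Each chunk is labelled on its own, so structures spanning chunks are missed.
label_image = darr_bin.map_blocks(
    lambda block: ndimage.label(block)[0].astype(np.int32),
    dtype=np.int32,
)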

Thanks!

Hi @maxbeegee, welcome to Dask community!

I can confirm the problem you're encountering. Looking at the label function's source code, a lot happens there that I don't fully understand.

cc @Genevieve @jakirkham

Maybe it would be better to raise an issue there?

Thanks @guillaumeeb! Sorry, but what exactly do you mean by "there"? Open an issue on GitHub?

Oh yes, sorry, I meant opening an issue on the dask-image GitHub issue tracker.

Thanks! I have done so now: Memory error using dask-image ndmeasure.label · Issue #391 · dask/dask-image · GitHub