So, while experimenting with your script on my puny laptop (8-core CPU, 16 GB RAM), I discovered that the processes scheduler, and not the threaded scheduler (the default), yielded the better performance on a single machine. The former drove all the cores at 100% while the latter used them at only about 33%; consequently, the former was almost 3 times faster. It turns out the reason for this outcome lies in the guts of Python itself, namely the so-called GIL (Global Interpreter Lock), which severely limits multi-threading of CPU-bound pure-Python code.
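If you want to see the GIL effect in isolation, here is a minimal stdlib-only sketch (nothing to do with your script; the workload and numbers are purely illustrative). Splitting a CPU-bound pure-Python loop across two threads gives essentially no speedup, because only one thread can execute Python bytecode at a time:

```python
import threading
import time

def count_down(n):
    # pure-Python, CPU-bound loop: the running thread holds the GIL throughout
    while n:
        n -= 1

N = 3_000_000

# two calls back to back, in one thread
t0 = time.perf_counter()
count_down(N)
count_down(N)
serial = time.perf_counter() - t0

# the same total work split across two threads: the GIL lets only one
# thread run Python bytecode at a time, so the wall time barely improves
t0 = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

print(f"serial:   {serial:.2f} s")
print(f"threaded: {threaded:.2f} s (roughly the same, despite 2 threads)")
```

Separate processes each get their own interpreter and their own GIL, which is why the processes scheduler can actually saturate all the cores.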
And as expected, the memory footprint during runtime was proportional to the chunk size (that is, the `nframes` parameter to `dask_image.imread.imread()`). So `nframes` should not be set too high, otherwise your computer's RAM will be overwhelmed.
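A quick back-of-envelope calculation shows why: each chunk holds `nframes` decoded frames in RAM at once. The figures below (1080p RGB video, one byte per channel) are just illustrative assumptions, not measurements from your video:

```python
# assumed frame geometry: 1080p RGB, uint8 pixels (1 byte per channel)
height, width, channels, bytes_per_px = 1080, 1920, 3, 1

def chunk_mib(nframes):
    """Approximate decoded size of one chunk, in MiB."""
    return nframes * height * width * channels * bytes_per_px / 2**20

for nframes in (32, 256, 1024):
    print(f"nframes={nframes:4d} -> {chunk_mib(nframes):8.1f} MiB per chunk")
```

And keep in mind that with the processes scheduler each worker holds at least one chunk (plus intermediates) at a time, so the total footprint scales with the number of workers as well.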
```python
... # import statements from your script
from dask.distributed import Client

... # leading script statements
client = Client(threads_per_worker=1)
video_path = "/path/to/video.mp4"
pims.ImageIOReader.class_priority = 100
# I'm sure Cheetos could manage a larger `nframes` value here:
original_video = dask_image.imread.imread(video_path, nframes=32)
... # rest of the script, with the `rechunk()` and `persist()` statements removed!
client.shutdown()
```
Alternatively, if you don't want to import `Client` from `dask.distributed` as in the code snippet above, then simply place your `.to_zarr()` calls within a `dask.config.set(scheduler='processes')` context manager:
```python
... # leading script statements
# I'm sure Cheetos could manage a much larger `nframes` value here:
original_video = dask_image.imread.imread(video_path, nframes=32)
... # rest of the script, with the `rechunk()`, `persist()`, and `.to_zarr()` statements removed!

with dask.config.set(scheduler='processes'):
    da.to_zarr(hog_images, "/path/to/hog_images/data.zarr", compressor=compressor)
    da.to_zarr(hog_descriptors, "/path/to/hog_descriptors/data.zarr", compressor=compressor)
    ... # etc.
```
And oh, when I ran the processes scheduler, PIMS repeatedly spewed out warning messages reporting an `AttributeError` exception. Simply ignore them; they do not appear to affect the final result. I later discovered (see this link) that those warnings occur because, unlike with the threaded scheduler, the workers of the processes scheduler do not share the updated value of `ImageIOReader.class_priority`. So `pims.open()` first attempts a couple of other readers with higher priority than `ImageIOReader`. Luckily, however, those other readers fail for some reason (on my laptop), so `pims.open()` eventually falls back to `ImageIOReader` anyway. I expect similar behavior to occur on Cheetos.
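The underlying mechanism is easy to reproduce with the stdlib alone. Worker processes get their own copy of the parent's state when they start, so a class attribute changed in the parent afterwards (or, under the spawn start method, changed at all) is invisible to them. In this sketch, `Reader` is just a hypothetical stand-in for `pims.ImageIOReader`:

```python
import multiprocessing as mp

class Reader:                 # stand-in for pims.ImageIOReader
    class_priority = 10

def worker_priority(_):
    # runs inside a worker process, so it sees the worker's copy of Reader
    return Reader.class_priority

ctx = mp.get_context("fork")  # POSIX-only start method
pool = ctx.Pool(2)            # workers are forked *now*, with priority 10
Reader.class_priority = 100   # parent-side bump, after the workers exist
seen = pool.map(worker_priority, range(2))
pool.close()
pool.join()
print(seen)  # the workers still report the old value: [10, 10]
```

If the warnings ever did become a problem, one possible (untested) workaround with `dask.distributed` would be to apply the `class_priority` bump on every worker, e.g. via `client.run()`, rather than only in the parent process.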
I hope this helps.