Performing HOG Matrices on PIMS Chunks through ImageIO

Update for this thread:

It appears that things work well on a Windows machine which relies upon spawn for creating processes while on a Linux machine it relies upon fork for creating processes. This could be the reason things are struggling when writing to zarrs later, but am awaiting further information from Zarr/Dask developers.

See here for reference about multi-processing initialization.

EDIT:

Changed solution from @ParticularMiner above since as it was written there didn’t quite work for my particular use case on Linux. New solution will be coming soon!

Oh good, I take it things are working reasonably well now?

1 Like

Hey @Genevieve !

Welcome on this thread!

On Microsoft Windows, yes.

I don’t have access to a full-fledged Linux machine, so I don’t know for Linux. I believe @jmdelahanty will be able to tell, perhaps later today.

2 Likes

Hey @Genevieve!

Things do now work on a Linux machine! They will also work on Windows.

I’ve learned that earlier versions of the code had me putting the to_zarr arguments in the incorrect places. I was doing to_zarr(path, array) when you need to do to_zarr(array, path)! Such a silly error, but one that @jakirkham, @martindurant , and @ParticularMiner pointed out to me for an easy fix.

Here’s the full updated code for what’s currently operating on Cheetos (a computer in our cluster). It runs through a video of about 44k frames in just about 8 minutes! Very cool!

# Adapted repository for mouse facial expression analysis from Dolensek et al 2020
# Written with the gracious help of John Kirkham, Josh Moore, and Martin Durant (Dask/Zarr Developers)
# ParticularMiner, Jeremy Delahanty May 2022

import numpy as np
import time
import warnings
# Import Histogram of Oriented Gradients (HOG) function for expression assessment
from skimage.feature import hog
# Use Blosc as compressor for zarrs written to disk
from numcodecs.blosc import Blosc
# Use PIMS for laziliy loading video files into numpy arrays
import pims
# Use dask.config to set scheduler as using processes
import dask.config
# Use dask.array for using to_zarr method later
import dask.array as da
# Use dask_image.imread to read in video data via pims and create dask arrays as a result
import dask_image.imread

# You could use this class that ParticularMiner wrote to read video into grayscale; left since it is
# informative on how to use custom PIMS readers.

# class GreyImageIOReader(FramesSequence):
#     class_priority = 100  # we purposefully set this very high to force pims.open() to use this reader
#     class_exts = pims.ImageIOReader.class_exts

#     def __init__(self, filename, **kwargs):
#         self._reader = pims.ImageIOReader(filename, **kwargs)
#         self.filename = self._reader.filename
    
#     def close(self):
#         self._reader.close()  # important since dask-image uses pims.open() as a context manager

#     def get_frame(self, n):
#         return pims.as_grey(self._reader.get_frame(n))

#     def __len__(self):
#         return self._reader.__len__()

#     @property
#     def frame_shape(self):
#         return self._reader.frame_shape[:-1]

#     @property
#     def pixel_type(self):
#         return self._reader.pixel_type

def as_grey(frame):
    """Convert a 2D image or array of 2D images to greyscale.

    This weights the color channels according to their typical
    response to white light.

    It does nothing if the input is already greyscale.
    (Copied and slightly modified from pims source code.)
    """

    if len(frame.shape) == 2 or frame.shape[-1] != 3:  # already greyscale
        return frame
    else:
        red = frame[..., 0]
        green = frame[..., 1]
        blue = frame[..., 2]
        return 0.2125 * red + 0.7154 * green + 0.0721 * blue

def make_hogs(frames, coords, kwargs):
    """
    Performs cropping and HOG generation upon chunk of frames.

    A chunk of frames received from dask_image.imread.imread is operated
    on with this function. The coordinates supplied are from cropping performed
    in a previous step on an example image and are used for cropping the chunk
    of frames supplied. The kwargs used define arguments that hog() expects.
    In this use case, both the HOG images and descriptors are returned.
    """

    # frames will be a chunk of elements from the dask array
    # coords are the cropping coordinates used for selecting subset of image
    # kwargs are the keyword arguments that hog() expects
    # Example:
    # kwargs = dict(
    # orientations=8,
    # pixels_per_cell=(32, 32),
    # cells_per_block=(1, 1),
    # transform_sqrt=True,
    # visualize=True
    # )

    # Perform cropping procedure upon every frame, the : slice,
    # crop the x coordinates in the second slice, and crop the y
    # coordinates in the third slice. Save this new array as 
    # new_frames
    new_frames = frames[
        :,
        coords[1]:coords[1] + coords[3],
        coords[0]:coords[0] + coords[2]
    ]

    # Get the number of frames and shape for making
    # np.arrays of hog descriptors and images later
    nframes = new_frames.shape[0]
    first_frame = new_frames[0]

    # Use first frame to generate hog descriptor np.array and
    # np.array of a hog image
    hog_descriptor, hog_image = hog(
        first_frame,
        **kwargs
        )

    # Make empty numpy array that equals the number of frames passed into
    # the function, use the fed in datatype as the datatype of the images
    # and descriptors, and make the arrays shaped as each object's shape
    hog_images = np.empty((nframes,) + hog_image.shape, dtype=hog_image.dtype)
    hog_descriptors = np.empty((nframes,) + hog_descriptor.shape, dtype=hog_descriptor.dtype)

    # Until I edit the hog code, perform the hog calculation upon each
    # frame in a loop and append them to their respective np arrays
    for index, image in enumerate(new_frames):
        hog_descriptor, hog_image = hog(image, **kwargs)
        hog_descriptors[index, ...] = hog_descriptor
        hog_images[index, ...] = hog_image
    
    return hog_descriptors, hog_images


def get_ith_tuple_element(tuple_, i=0):
    """
    Utility function for grabbing different data out of each
    returned tuple of numpy arrays from make_hogs().
    0 = images
    1 = descriptors
    """
    return  tuple_[i]


def normalize_hog_desc_dims(tuple_):
    # add more dimensions (each of length 1) to the hog descriptor chunk in
    # order to match the number of dimensions of the hog_image
    descriptor = tuple_[0]
    image = tuple_[1]
    if descriptor.ndim >= image.ndim:
        return tuple_[0]
    else:
        return  np.expand_dims(
            tuple_[0], axis=list(range(descriptor.ndim, image.ndim))
        )

# For each instance of PIMS opening the video it seems that it will warn you that
# the number of frames in  your video will not divide cleanly. You can ignore this
# warning.
warnings.filterwarnings('ignore', '`nframes` does not nicely divide')

# Wrap functions in a __name__ = "__main__" statement for allowing threads to spawn
# (fork in Linux case) properly. The reasons for this are still fuzzy to me...
if __name__ == "__main__":

    program_start = time.perf_counter()
    
    video_path = "/path/to/movie.mp4"

    pims.ImageIOReader.class_priority = 100  # we set this very high in order to force dask's imread() to use this reader [via pims.open()]

    # These coords are determined beforehand, should later be loaded from disk or have fiducial cropping performed so everything is the same
    coords = (496, 174, 992, 508)

    # Use dask_image's reader to read your frames via PIMS, change nframes according to
    # your chunksize, which will likely be determined by your machine's available RAM mostly...
    original_video = dask_image.imread.imread(video_path, nframes=16)

    # kwargs to use for generating both hog images and hog_descriptors
    # Values from original paper Dolensek et al 2020 in Science
    kwargs = dict(
        orientations=8,
        pixels_per_cell=(32, 32),
        cells_per_block=(1, 1),
        transform_sqrt=True,
        visualize=True
    )

    # Map the chunks of original video to as_grey function, basically drop
    # the color channel for each frame.
    grey_frames = original_video.map_blocks(as_grey, drop_axis=-1)

    # Define a meta for the shape of the dask arrays that are returned.
    # Both the images will be returned as a 3D numpy array.
    # Images: 2D images, several images per chunk
    # Descriptors: 2D histogram, several descriptors per chunk
    meta = np.array([[[]]])

    # Provide the datatype of the grayscaled images for writing dask arrays
    # later.
    dtype = grey_frames.dtype

    # Map the chunks of the grey_frames results onto the make_hogs() function.
    # Perform cropping, HOG calculations, and HOG image generation.
    my_hogs = grey_frames.map_blocks(
        make_hogs,
        coords=coords,
        dtype=dtype,
        meta=meta,
        kwargs=kwargs,
    )

    # first determine the output hog shapes from the first grey-scaled image so that
    # we can use them for all other images: 
    first_hog_descr, first_hog_image = make_hogs(
        grey_frames[:1, ...].compute(),
        coords,
        kwargs
    )

    # Get the resulting images out of the returned tuple from my_hogs() chunks. Save that
    # as a dask array of images.
    hog_images = my_hogs.map_blocks(
        get_ith_tuple_element,
        i=1,
        chunks=(grey_frames.chunks[0],) + first_hog_image.shape[1:],
        dtype=dtype,
        meta=meta
    )

    # Define shape of the HOG descriptor arrays
    descr_array_chunks = (grey_frames.chunks[0],) + first_hog_descr.shape[1:]

    # Dask needs consistent shapes for performing computations in the graph. Add
    # the needed dimensions for the arrays to be passed through the computations.
    if first_hog_descr.ndim <= first_hog_image.ndim:
        # we will recreate the missing hog_descriptor axes but give them each a size of 1
        new_axes = []
        n_missing_dims = first_hog_image.ndim - first_hog_descr.ndim
        descr_array_chunks += (1,)*n_missing_dims
    else:
        new_axes = list(range(first_hog_image.ndim, first_hog_descr.ndim))

    # Do not use `drop_axes` here!  `drop_axes` will attempt to concatenate the
    # tuples, which is undesirable.  Instead, use `squeeze()` later to drop the
    # unwanted axes.
    hog_descriptors = my_hogs.map_blocks(
        normalize_hog_desc_dims,
        new_axis=new_axes,
        chunks=descr_array_chunks,
        dtype=dtype,
        meta=meta,
    )

    # Drop the last unneeded dimension of the descriptors
    hog_descriptors = hog_descriptors.squeeze(-1)

    # Define Blosc as the compressor
    compressor = Blosc(cname='zstd', clevel=1)

    # Tell dask to perform computations via processes and not threads. Using threads yields severe
    # performance decreases! I don't know specifically why it struggles so much...
    with dask.config.set(scheduler='processes'):
        da.to_zarr(hog_images, "/path/to/data.zarr", component="images")
        da.to_zarr(hog_descriptors, "/path/to/data.zarr", component="descriptors")

    print("Data written to zarr! Hooray!")

    # On a powerful machine (64 Cores, Intel Xeon, 256GB RAM), approx. 44k frames complete in
    # 8 minutes!
    program_end = time.perf_counter() - program_start

    print(f"PROGRAM RUNTIME: {program_end}")

There are several more things that have to be done to this so it’s more part of a coherent system, but for this question in particular this set of code runs to completion!

For future updates, here’s where I’ll be trying to make a more complete set of tools for running the pipeline that the original lab had written for our lab.

2 Likes

Nice, that’s a great speed improvement!

1 Like

Okay, new update! It turns out that while this runs to completion, the data that’s output isn’t quite right! The HOG images and descriptors are full of 0s for some reason! Something is going wrong somewhere… gotta dive in and see where.

EDIT:

We solved it!

Gogolla Lab member @StoyoKaramihalev discovered a bug with the dask implementation of HOG calculations! The data that was written to disk was discovered to be entirely full of zeros! Not very helpful for doing facial expression analysis.

When you looked at the output of the script, you ended up with things like this:

@ParticularMiner and I spent some time this evening hacking out what the cause could be…

It turns out that the reason for this was in the current version’s code on line 194: The dtype was specified to be the same dtype as the gray_frames block.

We initially thought that since we were getting images out of the hog() function, we might as well stay consistent with the data type we were using, but we did this without double checking what the datatypes were of the returned objects from hog()! hog() returns objects as float64 and the original images are uint8. For reasons I don’t quite understand yet, converting from float64 to uint8 yields a 0 value for all the data in the returned objects! Perhaps hog() returns values that are all decimals or something?

If we instead specify the datatype to be float64 by stating dtype = first_hog_image.dtype, we retain the correct data type when all the downstream computations are performed by Dask!

If we specify dtype as the value from gray_frames, the returned value of the dask array will be converted to that type. See here in the docs.

This was tested by @ParticularMiner through the use of compute() on a small test video. The values of the data were all correct before being written to disk via to_zarr! So that clued us in that we were doing something incorrectly not with the make_hogs function but rather with our call to_zarr. I checked out some datatypes of the written zarr vs. the compute result that @ParticularMiner found and noticed they were different! When we tried keeping the datatype as float64 from the output of make_hogs, everything worked like it was supposed to!

Here’s the kind of result you can get now:

So, here’s the updated version that seems to work like it’s supposed to:

# Adapted repository for mouse facial expression analysis from Dolensek et al 2020
# Written with the gracious help of John Kirkham, Josh Moore, and Martin Durant (Dask/Zarr Developers)
# ParticularMiner, Jeremy Delahanty May 2022

import zarr
import numpy as np
import time
import warnings
# Import Histogram of Oriented Gradients (HOG) function for expression assessment
from skimage.feature import hog
# Use Blosc as compressor for zarrs written to disk
from numcodecs.blosc import Blosc
# Use PIMS for laziliy loading video files into numpy arrays
import pims
# Use dask.config to set scheduler as using processes
import dask.config
# Use dask.array for using to_zarr method later
import dask.array as da
# Use dask_image.imread to read in video data via pims and create dask arrays as a result
import dask_image.imread

# You could use this class that ParticularMiner wrote to read video into grayscale; left since it is
# informative on how to use custom PIMS readers.

# class GreyImageIOReader(FramesSequence):
#     class_priority = 100  # we purposefully set this very high to force pims.open() to use this reader
#     class_exts = pims.ImageIOReader.class_exts

#     def __init__(self, filename, **kwargs):
#         self._reader = pims.ImageIOReader(filename, **kwargs)
#         self.filename = self._reader.filename
    
#     def close(self):
#         self._reader.close()  # important since dask-image uses pims.open() as a context manager

#     def get_frame(self, n):
#         return pims.as_grey(self._reader.get_frame(n))

#     def __len__(self):
#         return self._reader.__len__()

#     @property
#     def frame_shape(self):
#         return self._reader.frame_shape[:-1]

#     @property
#     def pixel_type(self):
#         return self._reader.pixel_type

def as_grey(frame):
    """Convert a 2D image or array of 2D images to greyscale.
    This weights the color channels according to their typical
    response to white light.
    It does nothing if the input is already greyscale.
    (Copied and slightly modified from pims source code.)
    """

    if len(frame.shape) == 2 or frame.shape[-1] != 3:  # already greyscale
        return frame
    else:
        red = frame[..., 0]
        green = frame[..., 1]
        blue = frame[..., 2]
        return 0.2125 * red + 0.7154 * green + 0.0721 * blue

def make_hogs(frames, coords, kwargs):
    """
    Performs cropping and HOG generation upon chunk of frames.
    A chunk of frames received from dask_image.imread.imread is operated
    on with this function. The coordinates supplied are from cropping performed
    in a previous step on an example image and are used for cropping the chunk
    of frames supplied. The kwargs used define arguments that hog() expects.
    In this use case, both the HOG images and descriptors are returned.
    """

    # frames will be a chunk of elements from the dask array
    # coords are the cropping coordinates used for selecting subset of image
    # kwargs are the keyword arguments that hog() expects
    # Example:
    # kwargs = dict(
    # orientations=8,
    # pixels_per_cell=(32, 32),
    # cells_per_block=(1, 1),
    # transform_sqrt=True,
    # visualize=True
    # )

    # Perform cropping procedure upon every frame, the : slice,
    # crop the x coordinates in the second slice, and crop the y
    # coordinates in the third slice. Save this new array as 
    # new_frames
    new_frames = frames[
        :,
        coords[1]:coords[1] + coords[3],
        coords[0]:coords[0] + coords[2]
    ]

    # Get the number of frames and shape for making
    # np.arrays of hog descriptors and images later
    nframes = new_frames.shape[0]
    first_frame = new_frames[0]

    # Use first frame to generate hog descriptor np.array and
    # np.array of a hog image
    hog_descriptor, hog_image = hog(
        first_frame,
        **kwargs
        )

    # Make empty numpy array that equals the number of frames passed into
    # the function, use the fed in datatype as the datatype of the images
    # and descriptors, and make the arrays shaped as each object's shape
    hog_images = np.empty((nframes,) + hog_image.shape, dtype=hog_image.dtype)
    hog_descriptors = np.empty((nframes,) + hog_descriptor.shape, dtype=hog_descriptor.dtype)

    # Until I edit the hog code, perform the hog calculation upon each
    # frame in a loop and append them to their respective np arrays
    for index, image in enumerate(new_frames):
        hog_descriptor, hog_image = hog(image, **kwargs)
        hog_descriptors[index, ...] = hog_descriptor
        hog_images[index, ...] = hog_image
    
    return hog_descriptors, hog_images


def get_ith_tuple_element(tuple_, i=0):
    """
    Utility function for grabbing different data out of each
    returned tuple of numpy arrays from make_hogs().
    0 = images
    1 = descriptors
    """
    return  tuple_[i]


def normalize_hog_desc_dims(tuple_):
    # add more dimensions (each of length 1) to the hog descriptor chunk in
    # order to match the number of dimensions of the hog_image
    descriptor = tuple_[0]
    image = tuple_[1]
    if descriptor.ndim >= image.ndim:
        return tuple_[0]
    else:
        return  np.expand_dims(
            tuple_[0], axis=list(range(descriptor.ndim, image.ndim))
        )

# For each instance of PIMS opening the video it seems that it will warn you that
# the number of frames in  your video will not divide cleanly. You can ignore this
# warning.
warnings.filterwarnings('ignore', '`nframes` does not nicely divide')

# Wrap functions in a __name__ = "__main__" statement for allowing threads to spawn
# (fork in Linux case) properly. The reasons for this are still fuzzy to me...
if __name__ == "__main__":

    program_start = time.perf_counter()
    
    video_path = "path/to/video.mp4"

    pims.ImageIOReader.class_priority = 100  # we set this very high in order to force dask's imread() to use this reader [via pims.open()]

    # These coords are determined beforehand, should later be loaded from disk or have fiducial cropping performed so everything is the same
    coords = (496, 174, 992, 508)

    # Use dask_image's reader to read your frames via PIMS, change nframes according to
    # your chunksize, which will likely be determined by your machine's available RAM mostly...
    original_video = dask_image.imread.imread(video_path, nframes=16)

    # kwargs to use for generating both hog images and hog_descriptors
    # Values from original paper Dolensek et al 2020 in Science
    kwargs = dict(
        orientations=8,
        pixels_per_cell=(32, 32),
        cells_per_block=(1, 1),
        transform_sqrt=True,
        visualize=True
    )

    # Map the chunks of original video to as_grey function, basically drop
    # the color channel for each frame.
    grey_frames = original_video.map_blocks(as_grey, drop_axis=-1)

    # Define a meta for the shape of the dask arrays that are returned.
    # Both the images will be returned as a 3D numpy array.
    # Images: 2D images, several images per chunk
    # Descriptors: 2D histogram, several descriptors per chunk
    meta = np.array([[[]]])

    # first determine the output hog shapes from the first grey-scaled image so that
    # we can use them for all other outputs: 
    first_hog_descr, first_hog_image = make_hogs(
        grey_frames[:1, ...].compute(),
        coords,
        kwargs
    )

    # Provide the datatype of the hog images for writing dask arrays
    # later. If you use the datatype of the grey_frames, it will send all your
    # data to zero! Be sure to use the datatype that the function you're operating
    # with gives you unless it's safe to change it later!
    dtype = first_hog_image.dtype

    # Map the chunks of the grey_frames results onto the make_hogs() function.
    # Perform cropping, HOG calculations, and HOG image generation.
    my_hogs = grey_frames.map_blocks(
        make_hogs,
        coords=coords,
        dtype=dtype,
        meta=meta,
        kwargs=kwargs,
    )

    # Get the resulting images out of the returned tuple from my_hogs() chunks. Save that
    # as a dask array of images.
    hog_images = my_hogs.map_blocks(
        get_ith_tuple_element,
        i=1,
        chunks=(grey_frames.chunks[0],) + first_hog_image.shape[1:],
        dtype=dtype,
        meta=meta
    )

    # Define shape of the HOG descriptor arrays
    descr_array_chunks = (grey_frames.chunks[0],) + first_hog_descr.shape[1:]

    # Dask needs consistent shapes for performing computations in the graph. Add
    # the needed dimensions for the arrays to be passed through the computations.
    if first_hog_descr.ndim <= first_hog_image.ndim:
        # we will recreate the missing hog_descriptor axes but give them each a size of 1
        new_axes = []
        n_missing_dims = first_hog_image.ndim - first_hog_descr.ndim
        descr_array_chunks += (1,)*n_missing_dims
    else:
        new_axes = list(range(first_hog_image.ndim, first_hog_descr.ndim))

    # Do not use `drop_axes` here!  `drop_axes` will attempt to concatenate the
    # tuples, which is undesirable.  Instead, use `squeeze()` later to drop the
    # unwanted axes.
    hog_descriptors = my_hogs.map_blocks(
        normalize_hog_desc_dims,
        new_axis=new_axes,
        chunks=descr_array_chunks,
        dtype=first_hog_descriptor.dtype,
        meta=meta,
    )

    # Drop the last unneeded dimension of the descriptors
    hog_descriptors = hog_descriptors.squeeze(-1)

    # Define Blosc as the compressor
    compressor = Blosc(cname='zstd', clevel=1)

    # Tell dask to perform computations via processes and not threads. Using threads yields severe
    # performance decreases! I don't know specifically why it struggles so much...
    with dask.config.set(scheduler='processes'):
        da.to_zarr(hog_images, "path/to/data.zarr", component="images")
        da.to_zarr(hog_descriptors, "path/to/data.zarr", component="descriptors")

    print("Data written to zarr! Hooray!")

    # On a powerful machine (64 Cores, Intel Xeon, 256GB RAM), approx. 44k frames complete in
    # 8 minutes!
    program_end = time.perf_counter() - program_start

    print(f"PROGRAM RUNTIME: {program_end}")
1 Like

Yes that’s right, any array that is of datatype float will have values ranging between 0 and 1. A uint8 array has a range of possible pixel values between 0 and 255 (i.e. 2^8 different possible integers).

The conversion to integer datatype is rounding the pixel values to the nearest integer, which is zero or one (and it may not even be rounding the value carefully, but instead it’s entirely plausible that it’s using a floor function numpy.floor — NumPy v1.26 Manual so it’s speedier, and then ALL the values will get converted to zero).

Some libraries, like scikit-image, do some special handling in their conversion functions to stretch the range to the new minimum and maximum values before any information is lost in the datatype conversion. But they’re careful about it because they’re an image analysis library. Mostly you just get burned by it accidentally and maybe eventually learn to double check when you have a weird issue because you remember it happened before. I guess that last sentence covers a lot of software development, tbh