Performing HOG Matrices on PIMS Chunks through ImageIO

Hello everyone! I’m in the process of trying to get something running through Dask and PIMS. We’re adapting code from a different lab (check out the published repo here) and working on getting it to run on videos. It was originally written to be run on a bunch of individual images.

I was told recently that Dask already uses PIMS for reading in videos which is very nice! However, I’m running into a couple problems that I haven’t been able to solve.

One of the first things I’ve tried to do is make a class that I can operate upon and later write out different parameters from the analysis to a configuration file. I called it Shoat because it’s apparently the name for a baby Hog and the purpose of this repository is to perform HOG calculations on cropped images. I thought that was funny. I’m not particularly good at making classes, so if you have any advice I’m happy to learn and take criticism!

Here’s what that whole thing looks like:

class Shoat:
    
    def __init__(self, video_path, pixels_per_cell, cells_per_block, 
                 orientations, transform_sqrt, nframes=60):
        
        self.video_path = video_path
        self.pixels_per_cell = pixels_per_cell
        self.cells_per_block = cells_per_block
        self.transform_sqrt = transform_sqrt
        self.orientations = orientations
        self.first_frame = None
        self.low_ycrop = None
        self.high_ycrop = None
        self.low_xcrop = None
        self.high_xcrop = None
        self.shape = None
        self.dtype = None
        self.nframes = nframes
    
    @property
    def get_cropped_coordinates(self):
        crop_coordinates = cv2.selectROI("ROI Selection", first_frame)
        cv2.destroyAllWindows()
        
        self.low_ycrop = crop_coordinates[1]
        self.low_xcrop = crop_coordinates[0]
        self.high_ycrop = crop_coordinates[3]
        self.high_xcrop = crop_coordinates[2]
        
    @property
    def get_video_metadata(self):
        
        with pims.ImageIOReader(self.video_path) as imgs:
            self.shape = (len(imgs),) + imgs.frame_shape
            self.dtype = np.dtype(imgs.pixel_type)
            self.first_frame = imgs.get_frame(0)
            
        
        
    def create_hogs(self):
        
        ar = dask_image.imread.imread(self.video_path, nframes=self.nframes)
        
        return ar

You’ll notice that Shoat doesn’t currently perform any HOG calculations or cropping of images. I’m running into a couple problems.

First, I’d like for my Dask reader to use ImageIO which seems like a nice package because it’s running ffmpeg under the hood. Since we have the framerate used to record the video, I feel like using ImageIO for reading data into a Dask Array would be a good choice. I’m still pretty stuck getting this working.

The first thing is that my videos are currently being encoded with color channels despite my camera only being capable of monochrome recordings. I’m not sure how to make scikit-image specify that I just want to encode my video as a series of grayscale images yet. Since the RGB channel doesn’t have any additional data, I thought the first thing that would be good to do would be to simply read the video with the as_grey method like this inside my create_hogs function:

def create_hogs(self):
        
        sfname = str(self.video_path)
        
        grayscale_video = pims.as_grey(pims.ImageIOReader(sfname))
        
        ar = dask_image.imread.imread(grayscale_video, self.nframes)
        
        return ar

Unfortunately when I do this, I run into this error that I’m stuck at:

Input In [236], in Shoat.create_hogs(self)
     42 strfname = str(self.video_path)
     44 grayscale_video = pims.as_grey(pims.ImageIOReader(strfname))
---> 46 ar = dask_image.imread.imread(grayscale_video, self.nframes)
     48 ar.shape
     51 return ar

File ~/miniconda3/envs/facial_expression/lib/python3.8/site-packages/dask_image/imread/__init__.py:47, in imread(fname, nframes, arraytype)
     44     import cupy
     45     arrayfunc = cupy.asanyarray
---> 47 with pims.open(sfname) as imgs:
     48     shape = (len(imgs),) + imgs.frame_shape
     49     dtype = np.dtype(imgs.pixel_type)

File ~/miniconda3/envs/facial_expression/lib/python3.8/site-packages/pims/api.py:186, in open(sequence, **kwargs)
    183 eligible_handlers = set(h for h in all_handlers
    184                         if ext and ext in map(_drop_dot, h.class_exts()))
    185 if len(eligible_handlers) < 1:
--> 186     raise UnknownFormatError(
    187         "Could not autodetect how to load a file of type {0}. "
    188         "Try manually "
    189         "specifying a loader class, e.g. Video({1})".format(ext, sequence))
    191 def sort_on_priority(handlers):
    192     # This uses optional priority information from subclasses
    193     # > 10 means that it will be used instead of than built-in subclasses
    194     def priority(cls):

UnknownFormatError: Could not autodetect how to load a file of type  original repr:
    <framessequencend>
    axes: 3
    axis 'x' size: 1280
    axis 'y' size: 1024
    axis 't' size: 45270
    pixel datatype: uint8. Try manually specifying a loader class, e.g. Video((ImageIOReader,) processed through proc_func. Original repr:
    <FramesSequenceND>
    Axes: 3
    Axis 'x' size: 1280
    Axis 'y' size: 1024
    Axis 't' size: 45270
    Pixel Datatype: uint8)

I’ve tried looking up how to solve these two errors but have come up short. For reference, my video is in the .avi format.

For the first exception it finds (UnknownFormatError: Could not autodetect how to load a file of type original repr:), I’m pretty confused because when I load it like this:

video = pims.as_grey(pims.ImageIOReader(self.image_path)

The video loads properly! Also when it loads the video in the get_video_metadata function, the package has no problem.

For the second exception, I’ve been looking over the docs and I can’t seem to find how do that manual specification in the documentation.

Once I can get things into a Dask array of grayscale images, I need to do 2 things:

  1. Crop each image
  2. Perform a HOG calculation on each image

I think this would be best done with the dask.delayed functionality, but I’ve never done that before so I’m not sure how to do it. The original code uses joblib and Parallel. I got that to work on the pims image reader, but it only did it one frame at a time. I’m pretty sure that Dask will do it in a way that’s a bit more controlled and probably faster. The coolest part will be that I can save both the HOG arrays and also their outputs as images to their own zarr arrays maybe and overlay them upon my original videos as one time series!

Any advice from Dask world?

I was going to edit my original post, but figured for posterity it would be better to leave it as is and update the post with a comment about something simple I discovered:

The reason these errors were appearing was because the imread functionality in Dask uses the from_array method. The grayscale_video gives this:

ImageIOReader,) processed through proc_func. Original repr:
    <FramesSequenceND>
    Axes: 3
    Axis 'x' size: 1280
    Axis 'y' size: 1024
    Axis 't' size: 45270
    Pixel Datatype: uint8

Dask’s imread doesn’t know how to process an ImageIOReader! So my new question is how can I get the from_array method to work on an ImageIOReader? I’ve stared for a while at the from_array documentation and the dask_image source and can’t seem to get my head around how I can get the grayscale video into a Dask array. I think once that’s complete, the next steps are probably pretty simple!

1 Like

@jmdelahanty

:grin: That’s the first I’ve heard of “shoat”. Cool. Looks like you’re having fun.

  1. dask_image.imread.imread() expects a string (filename or glob) as input. Your input is not a string, hence the exception.

  2. dask_image.imread.imread() uses pims.open() via dask_image.imread._utils._read_frame() to access image files. (See dask-image/dask_image/imread/_utils.py.)

  3. Since you want to use pims.ImageIOReader() followed by pims.as_grey() you would have to somehow get pims.open() to use ImageIOReader(). You can do this by increasing its class_priority attribute:

    import numpy as np
    import pims
    import dask_image.imread
    
    
    def as_grey(frame):
        """Convert a 2D image or array of 2D images to greyscale.
    
        This weights the color channels according to their typical
        response to white light.
    
        It does nothing if the input is already greyscale.
        (Copied and slightly modified from pims source code.)
        """
        if len(frame.shape) == 2 or frame.shape[-1] != 3:  # already greyscale
            return frame
        else:
            red = frame[..., 0]
            green = frame[..., 1]
            blue = frame[..., 2]
            return 0.2125 * red + 0.7154 * green + 0.0721 * blue
    
    
    pims.ImageIOReader.class_priority = 100  # we set this very high in order to force dask's imread() to use this reader [via pims.open()]
    rgb_frames = dask_image.imread.imread(r'/path/to/video/file.mpg')  # uses ImageIOReader
    grey_frames = rgb_frames.map_blocks(as_grey, drop_axis=-1, dtype=np.float_)
    
    Click to see an alternative but longer solution
    The PIMS docs detail how to create a custom reader (follow this link: [Custom Readers in PIMS](http://soft-matter.github.io/pims/v0.5/custom_readers.html#custom-readers)):
    import pims
    import dask_image.imread
    
    from pims import FramesSequence
    
    
    class GreyImageIOReader(FramesSequence):
        class_priority = 100  # we purposefully set this very high to force pims.open() to use this reader
        class_exts = pims.ImageIOReader.class_exts
    
        def __init__(self, filename, **kwargs):
            self._reader = pims.ImageIOReader(filename, **kwargs)
            self.filename = self._reader.filename
        
        def close(self):
            self._reader.close()  # important since dask-image uses pims.open() as a context manager
    
        def get_frame(self, n):
            return pims.as_grey(self._reader.get_frame(n))
    
        def __len__(self):
            return self._reader.__len__()
    
        @property
        def frame_shape(self):
            return self._reader.frame_shape[:-1]  # color channel has already been removed
    
        @property
        def pixel_type(self):
            return self._reader.pixel_type
    
    
    res = dask_image.imread.imread(r'/path/to/video/file.mpg')  # this will use GreyImageIOReader defined above
    

Regarding the remaining tasks you mentioned, I strongly recommend you read the docs on map_blocks() and play around with the examples there. I’d expect it to be possible to port the same functions you originally got working on the pims reader to map_blocks() with little to no additional effort.

1 Like

@jmdelahanty

Please note the simpler solution I’ve just edited into my previous post. I prefer this simpler method because it makes a more precise shape-determination of the images-array ahead of schedule [that is, before you call compute()]. The earlier solution does not make a precise determination of the shape of the images-array ahead of schedule.

I also added a close() method to GreyImageIOReader since dask-image will need this to close the reader.

1 Like

Thanks so much @ParticularMiner! I’m getting to play around with this now and, for the simpler example, it looks like things work great! If I try using the updated class, however, it doesn’t look like pims is listening to the class priority. When I try running it as you have written here, I end up with this:

image

It looks like it still retains the color channel, so I’m guessing I’m using it wrong.

The simple version does seem to work just right though! Here’s what I get when I run it nearly the same as you have. The only difference I did was to retain the datatype of the original image, uint8.

image

Do you notice any particular reason that the GreyImageIOReader isn’t getting used? When I use it directly, everything works perfect! I get this out of it when inspecting the object:

<Frames>
Length: 45270 frames
Frame Shape: 1024 x 1280
Pixel Datatype: uint8

I also need to look more into the color channel weights that you used here, I hadn’t seen that before!

One of the next things I’ve been trying to adjust is the chunksize of the resulting dask array. I’ve done this in other contexts, but here I can’t seem get the map_blocks method to make chunks of images properly. I had thought that by simply adding chunks=(32, 1, 1) would have made new chunks of images with the same x and y dimensions, but do computations on stacks of 32 frames. Am I misunderstanding how to assign chunks to my array?

@jmdelahanty

  • Ah, that was my mistake — I forgot to account for the removed color channel! Please note the following update to the GreyImageIOReader class:
           @property
           def frame_shape(self):
                return self._reader.frame_shape[:-1]  # color channel has been removed
    
    Sorry for the inconvenience.
  • rechunk() is used to change the chunksize of your array: see docs.
  • The color channel weights you see are directly copied and pasted from PIMS’s source-code for as_grey(). So I did not determine them. You can see for yourself:
2 Likes

Thank you so much for the references and update! Now both things work just as expected. We’re really close now!

So I’m testing now but I have no idea if I’m using the map_blocks functionality correctly. For now, I’m using the simpler method of doing things and here’s what it looks like:

import numpy as np
from skimage.feature import hog
import cv2
import pims
from pims import FramesSequence
import dask.array as da
import dask_image.imread

def as_grey(frame):
    """Convert a 2D image or array of 2D images to greyscale.

    This weights the color channels according to their typical
    response to white light.

    It does nothing if the input is already greyscale.
    (Copied and slightly modified from pims source code.)
    """
    if len(frame.shape) == 2 or frame.shape[-1] != 3:  # already greyscale
        return frame
    else:
        red = frame[..., 0]
        green = frame[..., 1]
        blue = frame[..., 2]
        return 0.2125 * red + 0.7154 * green + 0.0721 * blue


def crop_frame(video_object):

    coords = cv2.selectROI("ROI Selection", video_object)
    cv2.destroyAllWindows()

    return coords

def make_hogs(frame, coords):

    new_frame = frame[
        coords[1]:coords[1] + coords[3],
        coords[0]:coords[1] + coords[2]]

    hog_image, my_hog = hog(new_frame)
    
    return hog_image, my_hog

video_path = "path/to/video.avi"

pims.ImageIOReader.class_priority = 100  # we set this very high in order to force dask's imread() to use this reader [via pims.open()]

original_video = dask_image.imread.imread(video_path)

first_frame = np.array(original_video[0])

coordinates = crop_frame(first_frame)

grey_frames = original_video.map_blocks(as_grey, drop_axis=-1)

grey_frames = grey_frames.rechunk({0: 100})

my_hogs = grey_frames.map_blocks(make_hogs, coordinates, dtype=np.uint8)

my_hogs.compute()

My understanding is that once you call compute, Dask will generate a task graph (so turn the video to grayscale and then perform the HOG calculation) and then funnel it through the available resources. I’m wondering if there’s a way to output how long each task takes. The original pipeline that I’m adapting takes about 10 minutes to iterate over all the images and calculate HOGs. With this so far, it looks like it takes a fair bit longer (something like 30 minutes). I’m guessing that it’s taking longer because it’s reading from a video instead of individual jpegs which it can probably do a lot faster overall.

Finally, once this is finished running, I’ll end up with a new dask.array called my_hogs. One thing I would like to do is save the two outputs that come out of the HOG function: one is an image of the HOG output while the other is the actually computed gradients. I think it would be cool to have both of those pieces written to disk so I can do stuff with them later. Is there a way to specify saving multiple outputs from a given function in dask? I can’t seem to tell how to save two independent outputs from a function to their own arrays…

EDIT:

Looks like I’ve done something else wrong, I’m running into this error from pims:

<class 'pims.imageio_reader.ImageIOReader'> errored: Could not load meta information
=== stderr ===
ffmpeg version 4.2.2-static https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, avi, from '/snlkt/specialk_cs/2p/raw/CSE014/20211211/20211211_CSE014_plane1_-320.375.avi':
  Metadata:
    encoder         : Lavf58.45.100
  Duration: 00:25:09.00, start: 0.000000, bitrate: 4184 kb/s
    Stream #0:0: Video: mpeg4 (Simple Profile) (DIVX / 0x58564944), yuv420p, 1280x1024 [SAR 1:1 DAR 5:4], 4179 kb/s, 30 fps, 30 tbr, 30 tbn, 30 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg4 (native) -> rawvideo (native))
Press [q] to stop, [?] for help
Output #0, image2pipe, to 'pipe:':
  Metadata:
    encoder         : Lavf58.29.100
    Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 1280x1024 [SAR 1:1 DAR 5:4], q=2-31, 943718 kb/s, 30 fps, 30 tbn, 30 tbc

@jmdelahanty

  • Hmm … it seems the error is not dask-related, so I can’t be sure from here what is causing it. Corrupted video file perhaps?

  • Correction: once you call compute() dask will execute the task-graph, not generate it. A task-graph is built any time you call a dask function like map_blocks() or imread().

  • Since the chunksize (number of images in a chunk) has been changed to 100, you must be aware that make_hogs() receives a chunk of 100 images at once as input. So you probably should loop hog() over all those images and stack-up all the processed images, since hog() is unable to process more than one image at a time (right?). Alternatively, you could rewrite hog()'s source code to do that (without looping), like I did for pims.as_grey(). In fact, I encourage you to try that and see if it boosts performance. In the meantime, here’s an untested code snippet:

def make_hogs(frames, coords):
    new_frames = frames[
        :,
        coords[1]:coords[1] + coords[3],
        coords[0]:coords[1] + coords[2],
    ]
	
	nframes = new_frames.shape[0]
	first_frame = new_frames[0]

    hog_descriptor, hog_image = hog(first_frame)
    hog_images = np.empty((nframes,) + hog_image.shape, dtype=frames.dtype)
    hog_descriptors = np.empty((nframes,) + hog_descriptor.shape, dtype=frames.dtype)

    for i, image in enumerate(new_frames):
        hog_descriptor, hog_image = hog(image)
        hog_descriptors[i, ...] = hog_descriptor
        hog_images[i, ...] = hog_image

    return hog_descriptors, hog_images

I do not know exactly what hog() does. I’ve simply assumed above that it returns a tuple of two arrays whose sizes remain constant over all the images in the chunk.

  • See this link for help on how to return more than one output.
2 Likes

I’m not sure what I did, but that error doesn’t appear to happen anymore! I think it’s because I’m running into a different thing first. See below.

Got it, that’s great to know. Thank you for the clarification!

That’s right! It’s supposed to just do the calculation on one image at a time, I hadn’t thought of that. Your looping solution looks like it should work! I’m going to try rewriting it as an exercise for myself and see if I can do it.

It returns 2 np.ndarrays in this case, just in case anyone who sees this is wondering. I can get this working on it’s own on a super small subset of images, but when I try running it as a dask.array I’m running into this error:

Traceback (most recent call last):
  File "dask_faces.py", line 105, in <module>
    descs, hoggers = make_hogs(original_video, coords, kwargs)
  File "dask_faces.py", line 76, in make_hogs
    hog_descriptor, hog_image = hog(first_frame)
  File "/home/jdelahanty/miniconda3/envs/facial_expression/lib/python3.8/site-packages/skimage/feature/_hog.py", line 198, in hog
    g_row = g_row_by_ch[rr, cc, idcs_max]
  File "/home/jdelahanty/miniconda3/envs/facial_expression/lib/python3.8/site-packages/dask/array/core.py", line 1884, in __getitem__
    self, index2 = slice_with_int_dask_array(self, index2)
  File "/home/jdelahanty/miniconda3/envs/facial_expression/lib/python3.8/site-packages/dask/array/slicing.py", line 1022, in slice_with_int_dask_array
    raise NotImplementedError("Don't yet support nd fancy indexing")
NotImplementedError: Don't yet support nd fancy indexing

I’ve been looking more into the source for the slicing module and the code for what hog does but I can’t seem to figure out how to move past it. Seems like my adventure with Dask for this project might end here! What do you think?

EDIT: I can just add into the for loop something like this:

image = np.array(image)

But I feel like that removes the advantages that dask gives me, right? It looks like it runs when I do that at least!

@jmdelahanty

It seems you misunderstood me. Are you calling hog(dask_array)? That’ll most likely not work, and that is not what I meant.

First of all, you can use hogs() as it is now, if you write make_hogs() in such a way that it loops over all images in a chunk (that’s the code snippet I posted above). Note that this does not mean you lose the advantages of parallelization. It just means that dask (by means of map_blocks) will automatically loop through the images in each chunk in parallel with other chunks thus achieving many-fold speed-up than when you’d loop without dask.

Secondly, it may be possible to further speed dask up by avoiding the loop in make_hogs(). But you can only do this if hog() is rewritten to handle a chunk of images rather than a single 2D image. You’ll have to check the source code of hog(). See:

Remember that a chunk in your case is a 3D numpy array; not a dask array!

1 Like

While I was driving home last night I realized that I somehow completely forgot map_blocks! I feel pretty silly for that one. So you’re totally right, calling hog(dask_array) doesn’t work. I’m going to run it with map_blocks now and see how it goes. Will update this post when I get there. Thanks for the clarification!

While it’s running, I’ll see if I can mess around with the hog function so it can take a 3D numpy array.

Okay, time for an edit:

I’m confused about a couple pieces here that I think is messing me up.

In the make_hogs function you amended above, you used the define keyword for defining the function. Is that from the attrs package? Should it be def instead?

When feeding in the arguments into make_hogs in the amended function, you exclude the variable coords. I was wondering if the function would be able to use the coordinates defined earlier on if it wasn’t passed into the function directly.

The way that I was thinking of writing it so the make_hogs function has everything it needs is like this. I added some comments for myself as well as some questions I had about how you wrote it:

def make_hogs(frames, coords, kwargs):

    # frames will be a chunk of elements from the dask array
    # coords are the cropping coordinates used for selecting subset of image
    # kwargs are the keyword arguments that hog() expects
    # Example:
    # kwargs = dict(
    # orientations=8,
    # pixels_per_cell=(32, 32),
    # cells_per_block=(1, 1),
    # transform_sqrt=True,
    # visualize=True
    # )

    # Perform cropping procedure upon every frame, the : slice,
    # crop the x coordinates in the second slice, and crop the y
    # coordinates in the third slice. Save this new array as 
    # new_frames
    new_frames = frames[
        :,
        coords[1]:coords[1] + coords[3],
        coords[0]:coords[0] + coords[2]
    ]

    # Get the number of frames and shape for making
    # np.arrays of hog descriptors and images later
    nframes = new_frames.shape[0]
    first_frame = new_frames[0]

    # Use first frame to generate hog descriptor np.array and
    # np.array of a hog image
    hog_descriptor, hog_image = hog(
        first_frame,
        **kwargs
        )

    # Make empty numpy array that equals the number of frames passed into
    # the function, use the fed in datatype as the datatype of the images
    # and descriptors, and make the arrays shaped as each object's shape
    # QUESTION ABOUT THIS PART BELOW
    hog_images = np.empty((nframes,) + hog_image.shape, dtype=frames.dtype)
    hog_descriptors = np.empty((nframes,) + hog_descriptor.shape, dtype=frames.dtype)

    # Until I edit the hog code, perform the hog calculation upon each
    # frame in a loop and append them to their respective np arrays
    for image in new_frames:
        hog_descriptor, hog_image = hog(image, **kwargs)

        # QUESTION: Use of ellipses here; this is something I've never
        # used before! Why do you use this instead of something like:
        # hog_descriptors.append(hog_descriptor)
        #
        # From what I've read online, this is a kind of self-referential
        # pointer in a list?
        #
        # QUESTION: I was also wondering about the use of 'i' here.
        # Is the use of 'i, ...' a fancy python technique? I haven't
        # seen this before either!
        hog_descriptors[i, ...] = hog_descriptor
        hog_images[i, ...] = hog_image

    return hog_descriptors, hog_images

My question about the hog_images and hog_descriptors in the block relates to the use of the meta parameter that you linked to above. I’m still wrapping my head around it, so forgive my butchering of its use!

If I understand correctly, the meta is effectively what you tell Dask to expect as your output. In this case, it’s two differently shaped numpy arrays.

meta = tuple(
     [
        np.array([ ], dtype=hog_image.dtype)[(np.newaxis,)*(hog_image.ndim - 1)],
        np.array([ ], dtype=hog_descriptor.dtype)[(np.newaxis,)*(hog_descriptor.ndim - 1)]
]

So I would effectively have Dask expect a tuple from make_hogs:

return (hog_descriptors, hog_images)

One additional thing:

To be clear here, the shapes of hog_descriptors and hog_images are different, but the shape of them individually is the same every time.

Very good questions @jmdelahanty! It seems you’re getting the hang of this!

I’m very sorry for the confusion: I indeed made some typos in my previous code-snippet — that’s what happens when one writes untested code! I’m glad you figured out most of them though!

Since you have corrected almost everything already, I’ll just concentrate on those remaining. I’ve also corrected my posts above. If you find any more such typos, do let me know.

  • instead of

        for image in new_frames:
    

    it should have been:

        for i, image in enumerate(new_frames):
    
  • Ellipsis can be quite handy (see this link).

  • Regarding meta, I forgot to say something else: meta need not be precisely of the same type/form as the chunk type returned. Its value is only nominal. In fact, it can be used to spoof dask into expecting something other than what the chunk-function actually returns. But the one essential thing is that meta should have a .ndim attribute.

  • A tuple does not have a .ndim attribute, so it cannot be used as a meta. So better use an array for meta. In fact, the meta array does not even need to be of the right dimensionality. In your case, just use meta=np.array([[[]]]), a 3D numpy array.

  • You can specify the true expected dimensionality of the output chunks using other parameters of map_blocks(), namely new_axis (and/or drop_axis) (see this link). But if the dimensionality of the output is the same as that of the input, then there is no need to do anything, as map_blocks() assumes that by default.

    def get_ith_tuple_element(tuple_, i=0):
        return tuple_[i]
    
    meta = np.array([[[]]])
    dtype = grey_frames.dtype
    my_hogs = grey_frames.map_blocks(
        make_hogs, coordinates=coordinates, dtype=dtype, meta=meta
    )
    my_hogs = my_hogs.persist()
    
    # At this point, `dask` thinks `my_hogs` has the same shape as `grey_frames`.
    # It doesn't know that `my_hogs` has chunks of tuples of arrays.  As long as
    # you don't compute `my_hogs` directly, you can get away with this
    # inconsistency.
    
    hog_images = my_hogs.map_blocks(
        get_ith_tuple_element, i=1, dtype=dtype, meta=meta
    )
    
    # At this point, `hog_images` truly has the same shape as `grey_frames`.
    
    # In contrast, a hog-descriptor has a different shape from a hog-image, so
    # we need to let `map_blocks()` know what to expect.  We will first tell
    # `map_blocks()` to drop the image axes (1 and 2) that `dask` thinks
    # `my_hogs` has and next include the descriptor axes as new ones:
    image_axes = [1, 2]
    hog_descriptor, hog_image = hog(first_frame)
    descriptor_axes = list(range(1, hog_descriptor.ndim + 1))  # this gives [1, 2, ..., hog_descriptor.ndim]
    
    # It is probably best not to chunk along any descriptor axes; only chunk
    # along the first axis of the entire array of all descriptors:
    descriptors_array_chunks = (grey_frames.chunks[0][0],) + hog_descriptor.shape
    
    hog_descriptors = my_hogs.map_blocks(
        get_ith_tuple_element,
        i=0,
        drop_axis=image_axes,
        new_axis=descriptor_axes,
        chunks=descriptors_array_chunks,
        dtype=dtype,
        meta=meta,  # you can correct `meta` for consistency, if you want. But it really does not matter.
    )
    

After this, you can save your arrays to disk in a file format of your choice. Recall my previous post at this link.

I hope this helps.

1 Like

I think this all actually runs @ParticularMiner ! It at least gets gets through the script I have on a small test video!

I need to try writing things to disk to make sure the outputs are what we expect, but it runs to completion through the map_blocks on both hog_images and hog_descriptors!

My understanding of things gest a little hazy even with all your helpful comments at the my_hogs.persist() piece. Have things actually been computed at the point of calling persist? I’m a little confused about the difference between persist and compute and when the actual work has been done. For example, when we get to the final block of code here that you’ve written, the hog_descriptors = my_hogs.map_blocks(... piece, has Dask actually executed all these different pieces? Or do you have to call compute on one of your arrays at the end so the graph Dask creates actually runs?

If things have actually been run and all that needs to happen is the final writing to disk piece which I know Dask does incredibly fast, it gets through my 30 minute videos in no time at all! That’s amazing!

@jmdelahanty

:+1: I’m glad things are working!

Ah yes. The difference between compute() and persist() can be hard to grasp at the beginning.

Ultimately, both persist() and compute() execute your chunk functions in the same way, but persist() returns a dask array while compute() returns a numpy array. What this means, in a distributed-memory computer cluster setting, is that persist() keeps the computed chunks separate and distributed among their respective workers (clients), while compute() concatenates the computed chunks into a single large numpy array on a single client (the chunks themselves vanish). Thus compute() can be prohibitive if the resulting array is a large one that cannot fit in the memory of a single client.

Truth be told, including persist() in your code is not necessary for the script to run to completion. But the script would run twice as slow than if persist() were present. This is because when you later execute both hog_images.compute() and hog_descriptors.compute(), and both attempt to access the chunks of my_hogs, make_hogs() will be ran twice for each chunk in my_hogs, that is, once per chunk for hog_images.compute(), and again once per chunk for hog_descriptors.compute().

In contrast, including persist() precomputes the chunks of my_hogs by calling make_hogs() once per chunk, with the computed chunks persisting in memory, so that they do not have to be computed again when accessed by either hog_images.compute() or hog_descriptors.compute().

Above, I have simply used hog_images.compute() and hog_descriptors.compute() to explain the dask dynamics in the absence/presence of the foregoing persist() statement. The same reasoning would apply to hog_images.persist() and hog_descriptors.persist(); or hog_images.to_hdf5() and hog_descriptors.to_hdf5(); or similar.

1 Like

Okay, I think I spoke too soon! It ran all the way through until it got to the actual data writing portion. I think that the reason it failed was because I didn’t actually call hog_images.compute() or hog_descriptors.compute()! I’m running that now to see.

So to clear up where I think I misunderstood, once I call persist() onto something, it will only execute to that point in the code block. So for subsequent map_blocks() calls should I call compute() on each of these pieces? And when I call compute, do I do the same thing as with persist? Namely:

hog_descriptors = hog_descriptors.compute()

hog_images = hog_images.compute()

I’m trying it out now in case it happens to work, but I wanted to double check about the understanding of when I need to specify the compute parameter.

@jmdelahanty

In principle, compute() should never be called, especially if it would result in an array that is too large to fit into memory. In your case, it would be better to save directly to disk (that is, without calling compute() at all). Use compute() only if you know your resulting array is small enough to fit into memory.

And no, compute() does not have to be used in the same way persist() was used.

2 Likes

Thank you for the clarification! Now that I’m not calling compute and am attempting writing things to disk directly I am so close I can taste the pork (get it, because of HOGs? Sorry…).

I’m running into a strange shape problem when writing to disk with this code:

def as_grey(frame):
    """Convert a 2D image or array of 2D images to greyscale.

    This weights the color channels according to their typical
    response to white light.

    It does nothing if the input is already greyscale.
    (Copied and slightly modified from pims source code.)
    """
    if len(frame.shape) == 2 or frame.shape[-1] != 3:  # already greyscale
        return frame
    else:
        red = frame[..., 0]
        green = frame[..., 1]
        blue = frame[..., 2]
        return 0.2125 * red + 0.7154 * green + 0.0721 * blue


def crop_frame(video_object):

    coords = cv2.selectROI("ROI Selection", video_object)
    cv2.destroyAllWindows()

    return coords

def make_hogs(frames, coords, kwargs):

    # frames will be a chunk of elements from the dask array
    # coords are the cropping coordinates used for selecting subset of image
    # kwargs are the keyword arguments that hog() expects
    # Example:
    # kwargs = dict(
    # orientations=8,
    # pixels_per_cell=(32, 32),
    # cells_per_block=(1, 1),
    # transform_sqrt=True,
    # visualize=True
    # )

    # Perform cropping procedure upon every frame, the : slice,
    # crop the x coordinates in the second slice, and crop the y
    # coordinates in the third slice. Save this new array as 
    # new_frames
    new_frames = frames[
        :,
        coords[1]:coords[1] + coords[3],
        coords[0]:coords[0] + coords[2]
    ]

    # Get the number of frames and shape for making
    # np.arrays of hog descriptors and images later
    nframes = new_frames.shape[0]
    first_frame = new_frames[0]

    print("NEW IMAGE SHAPE: ", first_frame.shape)

    # Use first frame to generate hog descriptor np.array and
    # np.array of a hog image
    hog_descriptor, hog_image = hog(
        first_frame,
        **kwargs
        )

    print("FIRST FRAME HOG SHAPE: ", hog_image.shape)

    # Make empty numpy array that equals the number of frames passed into
    # the function, use the fed in datatype as the datatype of the images
    # and descriptors, and make the arrays shaped as each object's shape
    hog_images = np.empty((nframes,) + hog_image.shape, dtype=hog_image.dtype)
    hog_descriptors = np.empty((nframes,) + hog_descriptor.shape, dtype=hog_descriptor.dtype)

    print("EMPTY HOG IMAGES NUMPY ARRAY SHAPE: ", hog_images.shape)

    # Until I edit the hog code, perform the hog calculation upon each
    # frame in a loop and append them to their respective np arrays
    for index, image in enumerate(new_frames):
        hog_descriptor, hog_image = hog(image, **kwargs)
        hog_descriptors[index, ...] = hog_descriptor
        hog_images[index, ...] = hog_image

    print("COMPUTED HOG IMAGE NUMPY ARRAY: ", hog_images.shape)
    
    return hog_descriptors, hog_images


def get_ith_tuple_element(tuple_, i=0):
    return  tuple_[i]


video_path = "test_vid.mp4"

pims.ImageIOReader.class_priority = 100  # we set this very high in order to force dask's imread() to use this reader [via pims.open()]

original_video = dask_image.imread.imread(video_path)

# Turn pims frame into numpy array that opencv will take for cropping image
coords = crop_frame(np.array(original_video[0]))

# kwargs to use for generating both hog images and hog_descriptors
kwargs = dict(
    orientations=8,
    pixels_per_cell=(32, 32),
    cells_per_block=(1, 1),
    transform_sqrt=True,
    visualize=True
    )

grey_frames = original_video.map_blocks(as_grey, drop_axis=-1)

grey_frames = grey_frames.rechunk({0: 100})

meta = np.array([[[]]])

dtype = grey_frames.dtype

my_hogs = grey_frames.map_blocks(
    make_hogs,
    coords=coords,
    kwargs=kwargs,
    dtype=dtype,
    meta=meta
    )

my_hogs = my_hogs.persist()

hog_images = my_hogs.map_blocks(
    get_ith_tuple_element,
    i = 1,
    dtype=dtype,
    meta=meta
)

# Redefine kwargs visualize flag so only descriptors are given
kwargs["visualize"] = False

image_axes = [1, 2]
hog_descriptor = hog(np.array(grey_frames[0]), **kwargs)
descriptor_axes = list(range(1, hog_descriptor.ndim + 1))

descriptors_array_chunks = (grey_frames.chunks[0][0],) + hog_descriptor.shape

hog_descriptors = my_hogs.map_blocks(
    get_ith_tuple_element,
    i=0,
    drop_axis=image_axes,
    new_axis=descriptor_axes,
    chunks=descriptors_array_chunks,
    dtype=dtype,
    meta=meta
)

print("DESCRIPTORS: ", type(hog_descriptors), hog_descriptors.shape, hog_descriptors)
print("IMAGES: ", type(hog_images), hog_images.shape, hog_images)

compressor = Blosc(cname='zstd', clevel=1)

da.to_zarr(hog_images, "hog_images/data.zarr", compressor=compressor)

da.to_zarr(hog_descriptors, "hog_descriptors/data.zarr", compressor=compressor)

print("Data written to zarr! Hooray!")

Here’s what I’m getting as output to my terminal for the shapes of different things:

NEW IMAGE SHAPE:  (495, 859)                                                                                   
FIRST FRAME HOG SHAPE:  (495, 859)
EMPTY HOG IMAGES NUMPY ARRAY SHAPE:  (100, 495, 859)
COMPUTED HOG IMAGE NUMPY ARRAY:  (100, 495, 859)
DESCRIPTORS:  <class 'dask.array.core.Array'> (900, 15840) dask.array<get_ith_tuple_element, shape=(900, 15840), dtyp$
=uint8, chunksize=(100, 15840), chunktype=numpy.ndarray>
IMAGES:  <class 'dask.array.core.Array'> (851, 1080, 1920) dask.array<get_ith_tuple_element, shape=(851, 1080, 1920),
dtype=uint8, chunksize=(100, 1080, 1920), chunktype=numpy.ndarray>

For some reason the shape of the computed arrays aren’t being retained somewhere and after testing for a while I can’t seem to find where it’s inheriting the shape of the original dataset’s frames. When trying to write to disk, this error appears:

    check_array_shape('value', value, sel_shape)
  File "/home/jdelahanty/miniconda3/envs/facial_expression/lib/python3.8/site-packages/zarr/util.py", line 535, in ch$
ck_array_shape
    raise ValueError('parameter {!r}: expected array with shape {!r}, got {!r}'
ValueError: parameter 'value': expected array with shape (100, 1080, 1920), got (100, 495, 859)

So the to_zarr call received the correct shape of the arrays, but it’s using the original size of the frames. Anything obvious you see that’s making this happen?

EDIT:

It seems like the problem is coming from the map_blocks call to the get_ith_tuple_element maybe? Since the images dask.array there also has the wrong size there, but I don’t know how it’s getting messed up. It looks like the only thing that’s inherited there is the dtype value from earlier which is grey_frames.dtype but I’m not sure how that would cause an issue…

@jmdelahanty

Oops. The fact that the cropping was changing the shape of the array had completely escaped me! But of course, that’s exactly what cropping means! Sorry, I hadn’t read your code carefully enough.

So here you have a situation where the map_blocks(make_hogs, ...) call has an input chunk-function make_hogs() that receives a batch of greyscale images of a particular size but outputs the batch with the images changed to a different size but the same dimensionality. Remember that map_blocks() has no idea this shape-change will take place, because the computation is delayed. In fact, it assumes the output of make_hogs() has the same shape as its input.

One way to amend this situation is, like you did for hog_descriptors = my_hogs.map_blocks(), you also need to inform the hog_images = my_hogs.map_blocks() call that get_ith_tuple_element()'s output will change. To do this you pass the chunks parameter to map_blocks().

Thus, you may change the following code-block in your script:

hog_images = my_hogs.map_blocks(
    get_ith_tuple_element,
    i = 1,
    dtype=dtype,
    meta=meta
)

# Redefine kwargs visualize flag so only descriptors are given
kwargs["visualize"] = False

image_axes = [1, 2]
hog_descriptor = hog(np.array(grey_frames[0]), **kwargs)
descriptor_axes = list(range(1, hog_descriptor.ndim + 1))

descriptors_array_chunks = (grey_frames.chunks[0][0],) + hog_descriptor.shape

to this:

# first determine the output hog shapes from the first grey-scaled image so that
# we can use them for all other images: 
first_hog_descriptor, first_hog_image = make_hogs(
    grey_frames[:1, ...].compute(), coords, **kwargs
)

hog_images = my_hogs.map_blocks(
    get_ith_tuple_element,
    i = 1,
	chunks=(grey_frames.chunks[0][0],) + first_hog_image.shape[1:]
    dtype=dtype,
    meta=meta
)

image_axes = [1, 2]
descriptor_axes = list(range(1, first_hog_descriptor.ndim))
descriptors_array_chunks = (grey_frames.chunks[0][0],) + first_hog_descriptor.shape[1:]

This could be the only necessary change you need to make, as far as I can see.

If it still doesn’t work, please send me the full error traceback log.

1 Like

@jmdelahanty

Please note the edit in my last post.

1 Like

This is absurdly close to working! I’ve run into another small issue here related to the fact that not all my chunks are the same size when writing to disk. For example, in the test_vid.mp4, I have 851 frames total. This frame count isn’t easily divisible by anything and my inputs to this won’t be guaranteed to be divisible by reasonable chunk sizes either. For now there’s chunks of 100 results each. Since the final chunk has a size of 51 for the first axis, zarr doesn’t seem to handle the different sized chunk at the end!

Here’s the error message I get:

  File "/home/jdelahanty/miniconda3/envs/facial_expression/lib/python3.8/site-packages/zarr/util.py", line 535, in check_array_shape
    raise ValueError('parameter {!r}: expected array with shape {!r}, got {!r}'
ValueError: parameter 'value': expected array with shape (100, 524, 911), got (51, 524, 911)

In some previous experiences writing to zarr using tifffile and dask, the final chunk was allowed to have a different chunk size than the rest from what I remember. I can’t seem to get this to happen here in this case. Any advice for that?