Performing HOG Matrices on PIMS Chunks through ImageIO

Very good questions @jmdelahanty! It seems you’re getting the hang of this!

I’m very sorry for the confusion: I did make some typos in my previous code snippet. That’s what happens when one writes untested code! I’m glad you figured out most of them, though!

Since you have already corrected almost everything, I’ll concentrate on what remains. I’ve also corrected my posts above. If you find any more typos, do let me know.

  • instead of

        for image in new_frames:
    

    it should have been:

        for i, image in enumerate(new_frames):
    
  • Ellipsis can be quite handy (see this link).

  • Regarding meta, one more thing I forgot to mention: meta need not have the same type or form as the chunks that are actually returned. Its value is only nominal, so it can even be used to make dask expect something other than what the chunk function really returns. The one essential requirement is that meta has a .ndim attribute.

  • A tuple has no .ndim attribute, so it cannot serve as meta; use an array instead. The meta array does not even need to have the right dimensionality. In your case, just use meta=np.array([[[]]]), an empty 3D numpy array of shape (1, 1, 0).

  • You can specify the true expected dimensionality of the output chunks using other parameters of map_blocks(), namely new_axis (and/or drop_axis) (see this link). But if the dimensionality of the output is the same as that of the input, then there is no need to do anything, as map_blocks() assumes that by default.
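
To make two of the points above concrete, here is a minimal numpy-only sketch. The names chunk_result and frames are made up for illustration, and I am assuming the Ellipsis remark refers to numpy-style indexing:

```python
import numpy as np

# A tuple of arrays (what a chunk function might return) has no `.ndim`,
# so it cannot play the role of `meta`.
chunk_result = (np.zeros(8), np.zeros((4, 5)))
print(hasattr(chunk_result, "ndim"))  # False

# An empty nested array does have `.ndim`, which is all `meta` needs here.
meta = np.array([[[]]])
print(meta.ndim, meta.shape)  # 3 (1, 1, 0)

# Ellipsis stands for "all remaining axes" when indexing.
frames = np.arange(2 * 4 * 5).reshape(2, 4, 5)  # toy stack of 2 frames
print(np.array_equal(frames[0, ...], frames[0, :, :]))  # True
```

Applied to your case, the full snippet looks like this: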

    import numpy as np
    from skimage.feature import hog  # assumed: `hog` is scikit-image's
    
    def get_ith_tuple_element(tuple_, i=0):
        return tuple_[i]
    
    meta = np.array([[[]]])
    dtype = grey_frames.dtype
    my_hogs = grey_frames.map_blocks(
        make_hogs, coordinates=coordinates, dtype=dtype, meta=meta
    )
    my_hogs = my_hogs.persist()
    
    # At this point, `dask` thinks `my_hogs` has the same shape as `grey_frames`.
    # It doesn't know that `my_hogs` has chunks of tuples of arrays.  As long as
    # you don't compute `my_hogs` directly, you can get away with this
    # inconsistency.
    
    hog_images = my_hogs.map_blocks(
        get_ith_tuple_element, i=1, dtype=dtype, meta=meta
    )
    
    # At this point, `hog_images` truly has the same shape as `grey_frames`.
    
    # In contrast, a hog-descriptor has a different shape from a hog-image, so
    # we need to let `map_blocks()` know what to expect.  We will first tell
    # `map_blocks()` to drop the image axes (1 and 2) that `dask` thinks
    # `my_hogs` has and next include the descriptor axes as new ones:
    image_axes = [1, 2]
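    # scikit-image's `hog()` returns the (descriptor, image) pair only with
    # `visualize=True`; pass the same keyword arguments here as in `make_hogs`.
    hog_descriptor, hog_image = hog(first_frame, visualize=True)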
    descriptor_axes = list(range(1, hog_descriptor.ndim + 1))  # this gives [1, 2, ..., hog_descriptor.ndim]
    
    # It is probably best not to chunk along any descriptor axes; only chunk
    # along the first axis of the entire array of all descriptors.
    # Note: this assumes every chunk along axis 0 holds the same number of frames.
    descriptors_array_chunks = (grey_frames.chunks[0][0],) + hog_descriptor.shape
    
    hog_descriptors = my_hogs.map_blocks(
        get_ith_tuple_element,
        i=0,
        drop_axis=image_axes,
        new_axis=descriptor_axes,
        chunks=descriptors_array_chunks,
        dtype=dtype,
        meta=meta,  # you can correct `meta` for consistency, if you want. But it really does not matter.
    )
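
If you want to see the drop_axis/new_axis mechanics in isolation, here is a small self-contained sketch. It does not use HOG at all: toy_descriptor, N_FEATURES and the toy frames array are hypothetical stand-ins, chosen only so that each chunk of shape (frames, height, width) maps to a block of shape (frames, features).

```python
import numpy as np
import dask.array as da

# Toy stand-in for `grey_frames`: 6 frames of 4x5 pixels, 2 frames per chunk.
frames_np = np.arange(6 * 4 * 5, dtype=np.float64).reshape(6, 4, 5)
frames = da.from_array(frames_np, chunks=(2, 4, 5))

N_FEATURES = 3  # hypothetical descriptor length


def toy_descriptor(block):
    """Hypothetical per-frame descriptor: (mean, min, max) of each frame."""
    return np.stack(
        [block.mean(axis=(1, 2)), block.min(axis=(1, 2)), block.max(axis=(1, 2))],
        axis=1,
    )


descriptors = frames.map_blocks(
    toy_descriptor,
    drop_axis=[1, 2],  # the two image axes disappear ...
    new_axis=[1],      # ... and one descriptor axis takes their place
    chunks=(frames.chunks[0][0], N_FEATURES),
    dtype=frames.dtype,
    meta=np.array([[]]),
)

print(descriptors.shape)  # (6, 3)
result = descriptors.compute()
print(result.shape)  # (6, 3)
```

Here the shape that dask expects and the shape that is actually computed agree, which is what the drop_axis, new_axis and chunks arguments are for.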
    

After this, you can save your arrays to disk in a file format of your choice. Recall my previous post at this link.
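
As one possible option (not necessarily the one from my earlier post), here is a minimal sketch that writes a dask array as a stack of .npy files and reads it back. The array hog_like and the temporary directory are placeholders for illustration only:

```python
import tempfile

import numpy as np
import dask.array as da

# Placeholder for an array such as `hog_descriptors`: 6 rows, chunked 2 at a time.
hog_like_np = np.arange(24, dtype=np.float64).reshape(6, 4)
hog_like = da.from_array(hog_like_np, chunks=(2, 4))

# Write the chunks along axis 0 as .npy files into a directory.
out_dir = tempfile.mkdtemp(prefix="hog_stack_")
da.to_npy_stack(out_dir, hog_like, axis=0)

# Load the stack back lazily as a dask array.
reloaded = da.from_npy_stack(out_dir)
print(reloaded.shape)  # (6, 4)
```

Other formats, such as zarr or HDF5, need extra packages installed, so pick whichever suits your workflow.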

I hope this helps.
