I see a tree reduce mechanism implemented for dask arrays. I currently operate on dask Bags, and I was curious whether there are existing dask reduce implementations which allow for custom reduce operations?
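For concreteness, this is the sort of custom reduction I mean on the array side; a minimal sketch using da.reduction, where the max reducer is just an illustrative placeholder:

```python
import dask.array as da
import numpy as np

x = da.random.random((1_000_000,), chunks=100_000)

# da.reduction builds a tree reduce: `chunk` runs on every block,
# `aggregate` combines the concatenated per-block partial results.
result = da.reduction(
    x,
    chunk=lambda block, axis, keepdims: np.max(block, axis=axis, keepdims=keepdims),
    aggregate=lambda parts, axis, keepdims: np.max(parts, axis=axis, keepdims=keepdims),
    dtype=x.dtype,
)
print(result.compute())  # matches x.max().compute()
```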
I guess this is one implementation: dask.bag.Bag.fold (see the Dask documentation).
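As a quick sketch of how it composes a custom reduction (the sum-of-squares example is just illustrative): binop folds elements within each partition and combine merges the per-partition results:

```python
import dask.bag as db

b = db.from_sequence(range(10), npartitions=4)

# binop folds elements within a partition into an accumulator;
# combine merges the accumulators from different partitions.
total = b.fold(
    binop=lambda acc, x: acc + x * x,
    combine=lambda left, right: left + right,
    initial=0,
)
print(total.compute())  # 285, the sum of squares of 0..9
```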
I was also curious whether there are existing benchmarks for different kinds of reduction topologies with dask data structures.
So as you’ve seen, there are Map/Reduce-like operations implemented in all the high-level collections that Dask offers.

I don’t think there is any benchmark, but since Dask Bag is a collection of arbitrary Python objects, it is generally considered less optimized than Arrays or DataFrames, which rely on NumPy and Pandas, C-optimized libraries under the hood.
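For arrays you can at least experiment with the reduction topology yourself: most array reductions accept a split_every argument that controls the fan-in of the tree. A sketch (not a benchmark):

```python
import dask.array as da

x = da.random.random((100_000_000,), chunks=1_000_000)

# Larger split_every -> shallower reduction tree with fewer merge tasks;
# smaller split_every -> deeper tree with more, smaller merge steps.
shallow = x.sum(split_every=16)
deep = x.sum(split_every=2)

print(shallow.compute(), deep.compute())
```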
Quoting the dask.bag documentation:

> These shuffle operations are expensive and better handled by projects like dask.dataframe. It is best to use dask.bag to clean and process data, then transform it into an array or DataFrame before embarking on the more complex operations that require shuffle steps.
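The Bag → DataFrame hand-off the docs recommend is itself cheap to write; a minimal sketch, with hypothetical dict-shaped records:

```python
import dask.bag as db

records = db.from_sequence(
    [{"x": i % 10, "y": float(i)} for i in range(100)],
    npartitions=4,
)

# Clean and filter while still a Bag, then convert so that
# shuffle-heavy operations like groupby run in dask.dataframe.
df = records.filter(lambda r: r["y"] >= 0).to_dataframe()
print(df.groupby("x").y.sum().compute())
```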
In this case I am mostly manipulating GPU-hosted torch.Tensor objects. They exist in this form because most of the transformations use torch’s CUDA kernels, which are not available in cudf or the other RAPIDS libraries. However, for the reduce operation in question I can use cudf operations.
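For reference, the torch → cudf hop itself can avoid a host round-trip: DLPack hands the existing device buffer over directly. A minimal sketch, assuming a recent torch and cudf with DLPack support:

```python
import torch
import cudf
from torch.utils.dlpack import to_dlpack

t = torch.arange(10, dtype=torch.float32, device="cuda")

# DLPack transfers ownership of the GPU buffer as-is,
# so no host<->device copy is involved.
s = cudf.from_dlpack(to_dlpack(t))  # a 1-D tensor becomes a cudf.Series
print(s.sum())
```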
Would you know if the overhead of transforming dask.Bag[torch.Tensor] → dask.Array[float], only for the reduce operation, is worth the speed gains? My experience performing torch.Tensor mini-batch processing in map_partitions, where every user function pays the conversion cost from DataFrame → Tensor, makes me believe that the conversion overhead is high.
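Since there don’t seem to be published numbers, I suppose the honest answer is to time it on my data; a rough sketch of the measurement I have in mind (the size here is hypothetical):

```python
import time
import torch
import cudf
from torch.utils.dlpack import to_dlpack

t = torch.rand(10_000_000, device="cuda")

torch.cuda.synchronize()
t0 = time.perf_counter()
s = cudf.from_dlpack(to_dlpack(t))  # the conversion in question
torch.cuda.synchronize()
t1 = time.perf_counter()

total = s.sum()  # cudf-side reduce; returns a host scalar (implicit sync)
t2 = time.perf_counter()

print(f"conversion: {t1 - t0:.6f}s, reduce: {t2 - t1:.6f}s")
```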
Sorry, I can’t tell at all.