I am trying to design a mechanism to distribute an asset (e.g. a model) across a large worker pool.
Passing the heavy object directly in the argument list of a Map operation leads to significant delays, as the scheduler seems to be distributing it to all the workers.
Alternatives such as centralized remote storage accessible to all the workers can quickly become bottlenecks for large worker pools.
I was curious whether anyone has attempted distributing the object among the workers using a topology like a spanning tree, where workers that have already received the asset start serving it to other workers. If not, would any of you have suggestions for implementing such a distribution mechanism?
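To make the idea concrete, here is a minimal sketch of the transfer schedule I have in mind. It is plain Python and does not use any scheduler API: the function name `broadcast_schedule`, the worker names, and the assumption that each holder can serve exactly one other worker per round are all hypothetical choices for illustration.

```python
from typing import List, Tuple

Transfer = Tuple[str, str]  # (sender, receiver)


def broadcast_schedule(workers: List[str]) -> List[List[Transfer]]:
    """Plan tree-style distribution rounds.

    Assumes workers[0] already holds the asset and that, in each round,
    every worker holding the asset sends it to one worker that does not.
    Returns one list of (sender, receiver) pairs per round.
    """
    have = workers[:1]      # workers that already hold the asset
    pending = workers[1:]   # workers still waiting for it
    rounds: List[List[Transfer]] = []
    while pending:
        # Each current holder serves at most one pending worker this round.
        batch = pending[:len(have)]
        pending = pending[len(batch):]
        rounds.append(list(zip(have, batch)))
        # Receivers become holders and can serve others in the next round.
        have = have + batch
    return rounds


if __name__ == "__main__":
    pool = [f"w{i}" for i in range(8)]
    for i, transfers in enumerate(broadcast_schedule(pool), start=1):
        print(f"round {i}: {transfers}")
```

Under these assumptions the number of holders doubles each round, so the number of rounds grows roughly logarithmically with the pool size instead of the seed serving every worker itself. The actual transfer step (how one worker pulls the asset from another) is deliberately left out, since that depends on what the framework exposes.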
NOT A CONTRIBUTION