For example, I want to generate a random Dask array like this:
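import dask.array as da

def daskCustom():
    # Build a lazy random array split into 100 x 100 x 100 chunks
    return da.random.random((10000, 10000, 10000), chunks=(100, 100, 100))

future = client.submit(daskCustom)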
Is it possible to distribute the load efficiently? In my case, I can see that one of the worker nodes is being used heavily.
Please suggest best practices to follow.
Let me know if more details are required.
Dask should distribute load automatically. It’s possible that work stealing is interfering; it has been known to make poor scheduling choices like this, see “Root-ish tasks all schedule onto one worker” (Issue #6573, dask/distributed on GitHub).
You could try disabling work stealing via the distributed.scheduler.work-stealing config.
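As a minimal sketch of that suggestion, the setting can be changed from Python with dask.config.set. This assumes the code runs in the same process that later starts the scheduler (for example, before creating a local cluster); for a scheduler launched separately, the value would presumably need to go into that scheduler's own configuration instead.

```python
import dask

# Disable work stealing. This must happen before the scheduler starts,
# because the scheduler reads the setting at startup (assumption: the
# scheduler is created later in this same process).
dask.config.set({"distributed.scheduler.work-stealing": False})

# Confirm the value that a scheduler started from here would see.
work_stealing = dask.config.get("distributed.scheduler.work-stealing")
print("work stealing enabled:", work_stealing)
```

If the imbalance persists with work stealing off, that would suggest the cause lies elsewhere.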