Hi, I’m using Dask distributed on an existing ECS cluster, and I’m running into issues with client.scatter() not distributing data evenly across the workers. Versions are 2022.7.1 on the workers, 2022.8.0 on the scheduler, and 2022.6.0 on the client.
I have 3 workers running, and the client sees all of them (len(client.nthreads()) is 3). Yet when I run
data = ['a', 'b', 'c']
future = client.scatter(data)
client.who_has()
I get
Calling client.rebalance() does nothing, and the only way I’ve found to get past this is to broadcast to all workers. But that causes memory issues, since the real objects are quite large. Is there any way I can force scatter to send one item to each worker?
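To make "one item per worker" concrete, here is a rough sketch of the behaviour I'm after. It assumes that passing a single address to scatter's `workers=` keyword pins that item to that worker, and it scatters each item in its own call. The helper names `pair_items_with_workers` and `scatter_one_per_worker` are hypothetical, made up for this post, and I haven't verified this against my mixed-version setup.

```python
from itertools import cycle


def pair_items_with_workers(items, worker_addresses):
    """Pair each item with a worker address, round-robin (hypothetical helper)."""
    # Sort so the assignment is deterministic across calls.
    addresses = sorted(worker_addresses)
    if not addresses:
        raise ValueError("no workers available")
    # cycle() wraps around if there are more items than workers.
    return list(zip(items, cycle(addresses)))


def scatter_one_per_worker(client, items):
    """Scatter each item in its own call, restricted to one worker via `workers=`."""
    # client.nthreads() returns a dict of {worker_address: thread_count};
    # only the addresses (the keys) are used here.
    pairs = pair_items_with_workers(items, client.nthreads())
    return [client.scatter(item, workers=[address]) for item, address in pairs]
```

With the example above this would be called as `futures = scatter_one_per_worker(client, data)`, and client.who_has() should then show where each key ended up. The obvious cost is one scatter call per item instead of a single batched call.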