On my dashboard, I can see the memory usage of the cluster and of each individual worker fluctuating (< 20%), but no tasks are being processed. What are some possible explanations for this?
Could this be caused by a `.compute()` or by a `client.scatter()` on a large array? Currently I have:

```python
filtered_waves = filtered.compute()
filtered_da = da.from_array(filtered_waves, chunks=wave_on_slice_channel.chunks)
filtered_futures = client.scatter(filtered_da, broadcast=True)
```
I am fairly certain that the first `.compute()` has completed. I strongly suspect the code is stuck somewhere between the second and third lines, i.e. in the `scatter` call.
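To narrow down which of the three steps actually stalls, one option is to wrap each call with a small timing helper that logs before and after. This is just a stdlib sketch (the `timed` helper is my own, not part of Dask); the commented lines show how it could wrap the three calls from the snippet above:

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger(__name__)

def timed(label, fn, *args, **kwargs):
    """Run fn(*args, **kwargs), logging start/end so a hang is attributable."""
    log.info("starting %s", label)
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    log.info("finished %s in %.1f s", label, time.perf_counter() - t0)
    return result

# Hypothetical usage around the three steps in question:
# filtered_waves = timed("compute", filtered.compute)
# filtered_da = timed("from_array", da.from_array, filtered_waves,
#                     chunks=wave_on_slice_channel.chunks)
# filtered_futures = timed("scatter", client.scatter, filtered_da,
#                          broadcast=True)
```

If the "starting scatter" line is the last thing logged, that confirms the hang is in the `scatter` rather than in `from_array`.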
However, before this section of code, I did exactly the same thing with `wave_on_slice_channel`:

```python
wave_future = client.scatter(wave_on_slice_channel, broadcast=True)
```
Both `wave_on_slice_channel` and `filtered_da` have exactly the same shape and size (~11 GB).
My individual workers each have 100 GB and the cluster has > 2 TB of memory.
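For what it's worth, a rough memory accounting suggests the broadcast itself should fit comfortably. With `broadcast=True`, each worker holds one full copy of the array; the worker count below is an assumption (the question only states > 2 TB of cluster memory), so this is a back-of-the-envelope sketch, not a measurement:

```python
# Back-of-the-envelope memory accounting for the broadcast scatter.
array_gb = 11        # size of filtered_da / wave_on_slice_channel (from above)
worker_mem_gb = 100  # memory per worker (from above)
n_workers = 20       # assumption: e.g. >2 TB cluster at ~100 GB per worker

per_worker_gb = array_gb * 1          # broadcast=True: one full copy per worker
cluster_total_gb = per_worker_gb * n_workers

# 11 GB per worker is well under the 100 GB limit, so memory pressure
# alone should not be what blocks the broadcast.
assert per_worker_gb < worker_mem_gb
```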