Hi team,
I currently use Dask as the distributed infrastructure for training an AI model. The working pattern is that the client repeatedly asks the workers to compute some metrics on numpy.ndarray objects, round after round, and the code for each round is the same. All my programs are written in Python.
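A minimal sketch of the pattern (compute_metrics, the array sizes, and the scheduler address are placeholders, not my real code):

import numpy as np
from dask.distributed import Client

client = Client("scheduler-address:8786")  # placeholder address

def compute_metrics(arr):
    # stand-in for the real metric computation
    return float(arr.mean())

arrays = [np.random.rand(1000, 1000) for _ in range(8)]

for round_idx in range(100):  # the same code runs every round
    futures = client.map(compute_metrics, arrays)
    results = client.gather(futures)  # pull the metric values back to the client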
However, after some rounds of compute, this memory warning appears:
“distributed.worker - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS”
I have called gc.collect on all workers via “client.run(gc.collect)” at the end of each round, but it doesn’t resolve the issue.
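Concretely, this is what I run at the end of each round (gc has to be imported on the client so the function can be shipped to the workers):

import gc
client.run(gc.collect)  # runs gc.collect once in every worker process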
My question: given that memory keeps increasing as the rounds go on, does this warning mean that some code running on the Dask workers keeps objects referenced after each round of compute? My understanding is that Python does GC, so I shouldn’t need to “del” anything myself, and I don’t use any dask.array or dask.dataframe.
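To make the question concrete, here is the kind of check I have in mind (memory_report is a hypothetical helper; my understanding is that client.run passes the Worker object to a parameter named dask_worker, and that psutil ships with distributed):

import os
import psutil

def memory_report(dask_worker):
    # dask_worker.data holds the task results Dask itself keeps (managed
    # memory); RSS beyond that is the “unmanaged” part the warning refers to
    rss_mb = psutil.Process(os.getpid()).memory_info().rss / 2**20
    return {"tasks_held": len(dask_worker.data), "rss_mb": round(rss_mb, 1)}

print(client.run(memory_report))  # one report per worker address

If tasks_held stays flat across rounds while rss_mb keeps climbing, would that confirm the memory is freed by Python but simply not returned to the OS, rather than my code holding references?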
Any comments are appreciated.
Chris Ding