Hi, I am trying to find more information on the xxx.partd files that Dask creates in the /tmp folder. These folders are created on each run and accumulate over time, taking up storage unnecessarily. Unfortunately, I can't find anything about them in the docs or on GitHub. Can I somehow manage them or clean them up automatically? I run Dask on Ray, if that changes anything. I'd appreciate any hints.
Hi @billwill, thanks for the question! Dask uses partd during shuffles to spill intermediate results to disk in larger-than-memory operations on a single machine. The tmp directory should be cleaned up automatically. I don't see any open issues on this in either the Ray or Dask projects. Would you mind sharing a bit more about your setup, or a minimal example?
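If the folders keep piling up in the meantime, a stopgap is to delete stale ones yourself. Below is a minimal sketch using only the standard library; it is not part of Dask or Ray. The function name `remove_stale_partd` and the `max_age_hours` parameter are made up for illustration, and it assumes the leftovers sit directly under the temp directory with names ending in `.partd`, as described in the question.

```python
import shutil
import tempfile
import time
from pathlib import Path


def remove_stale_partd(tmp_dir=None, max_age_hours=24.0):
    """Delete *.partd entries in tmp_dir that have not been modified recently.

    Hypothetical helper for illustration only. Returns the removed paths.
    """
    root = Path(tmp_dir or tempfile.gettempdir())
    cutoff = time.time() - max_age_hours * 3600
    removed = []
    for path in root.glob("*.partd"):
        try:
            if path.stat().st_mtime >= cutoff:
                continue  # recently modified; may belong to a running job
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
            removed.append(path)
        except OSError:
            # Entry vanished or is not ours to delete; skip it.
            continue
    return removed


if __name__ == "__main__":
    for p in remove_stale_partd(max_age_hours=24.0):
        print(f"removed {p}")
```

Pick `max_age_hours` comfortably longer than your longest job, since a directory's modification time does not necessarily change while a shuffle is still writing into it. It may also be worth checking whether Dask's `temporary-directory` config value applies to your setup, so that the folders land in a dedicated location instead of /tmp; I have not verified how that behaves with Dask on Ray.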
Hi, I recently ran into the same problem with partd files. I created an issue on GitHub, but it seems to be more related to Ray than to Dask.
Link to the issue: [Bug] [Dask-on-Ray] Partd files are not cleaned automatically · Issue #8787 · dask/dask · GitHub
Thank you @mmww! I also saw that you posted this question on the Ray Discourse. Hopefully someone there can help solve your problem!