Partd files clutter /tmp

Hi, I am trying to find more information on the xxx.partd files that Dask creates in the /tmp folder. These folders are created on each run and accumulate over time, taking up disk space unnecessarily. Unfortunately, I can’t find anything about them in the docs or on GitHub. Can I somehow manage them or clean them up automatically? I run Dask on Ray, if that changes anything. I’d appreciate any hints.

Hi @billwill, thanks for the question! Dask uses partd during shuffling to dump intermediate results to disk in larger-than-memory operations on a single machine. The temporary directory should be cleaned up automatically. I don’t see any open issues on this in either the Ray or Dask projects. Would you mind sharing a bit more about your setup, or a minimal example?
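Until the root cause is clear, a periodic sweep of leftover directories is one possible workaround. The snippet below is only a sketch under a few assumptions taken from the question: the leftovers are directories whose names end in `.partd`, they sit directly under the system temp directory, and anything older than a day does not belong to a running job. The function name `remove_stale_partd` and its parameters are made up for illustration and are not part of Dask, partd, or Ray.

```python
import shutil
import tempfile
import time
from pathlib import Path


def remove_stale_partd(tmp_dir=None, max_age_seconds=24 * 3600):
    """Delete *.partd directories under tmp_dir that look abandoned.

    Hypothetical helper: "abandoned" here means the directory's mtime is
    older than max_age_seconds. A directory's mtime only changes when
    entries are added or removed, so this is a rough heuristic; pick an
    age comfortably longer than your longest job.
    """
    root = Path(tmp_dir or tempfile.gettempdir())
    cutoff = time.time() - max_age_seconds
    removed = []
    for path in root.glob("*.partd"):
        try:
            if path.is_dir() and path.stat().st_mtime < cutoff:
                shutil.rmtree(path, ignore_errors=True)
                # Only report directories that are really gone.
                if not path.exists():
                    removed.append(path)
        except OSError:
            # The directory vanished or is unreadable; skip it.
            continue
    return removed
```

Calling `remove_stale_partd()` from a cron job or at the start of each run would sweep the default temp directory. It may also be worth checking whether Dask’s `temporary-directory` configuration value is honoured in your Dask-on-Ray setup, since pointing it at a dedicated scratch folder would make such a sweep safer.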


Hi, I recently ran into the same problem with partd files. I created an issue on GitHub, but it seems to be related more to Ray than to Dask.
Link to the issue: [Bug] [Dask-on-Ray] Partd files are not cleaned automatically · Issue #8787 · dask/dask · GitHub


Thank you @mmww! I also saw that you posted this question on the Ray Discourse. Hopefully someone there can help solve your problem!
