Dask-scratch-space

I’m launching some large workflows and I’ve noticed that my .dask/dask-scratch-space is getting sizable. When is it safe to remove these files? Can one do so as soon as the workers are lost? What are the scenarios where these files are not removed automatically?

Many thanks for any guidance.

Hi @liberabaci,

The main reason I know of for this space to grow is heavy spilling to disk, which happens when you try to load collections that are too big to fit in memory. Depending on your deployment (do you use LocalCluster on one machine, or something else?), you may want to move that directory to somewhere like /tmp or /scratch. You can do this through the Dask configuration with the key temporary-directory, or by passing a keyword argument when building your Cluster object.
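To make the two options concrete, here is a minimal sketch. The directory is a throwaway created with `tempfile` purely for illustration (in practice you would point at /tmp, /scratch, or another fast local disk), and the `LocalCluster` line is left commented out because it assumes `dask.distributed` is installed and that you are on a single machine:

```python
import tempfile

import dask

# Illustrative scratch location only; substitute your own /tmp or /scratch path.
scratch_dir = tempfile.mkdtemp(prefix="dask-scratch-")

# Option 1: set the configuration key before any cluster or worker is created.
dask.config.set({"temporary-directory": scratch_dir})
print(dask.config.get("temporary-directory"))

# Option 2 (assumes dask.distributed is installed): pass the directory
# as a keyword argument when building the cluster.
# from dask.distributed import LocalCluster
# cluster = LocalCluster(local_directory=scratch_dir)
```

The same key can also go in your Dask YAML configuration file or an environment variable if you prefer not to set it in code.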

These files are completely tied to a Worker. If the worker/cluster is stopped, you can remove them.

As for when they are not removed automatically: that happens if the worker did not terminate gracefully, so generally in case of an error.
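If you end up with leftovers after a crash, something like the following could help with cleanup. This is only a sketch using the standard library: `remove_stale_entries` is a hypothetical helper, not part of Dask, and the "not modified for an hour" rule is my own conservative assumption rather than anything Dask checks. Only run it with `dry_run=False` once you are sure no worker is still using the directory.

```python
import shutil
import time
from pathlib import Path


def remove_stale_entries(scratch_dir, min_age_seconds=3600.0, dry_run=True):
    """List entries in scratch_dir untouched for min_age_seconds.

    Hypothetical helper (not a Dask API). Entries are deleted only
    when dry_run is False; the list of stale paths is always returned.
    """
    root = Path(scratch_dir)
    if not root.is_dir():
        return []
    cutoff = time.time() - min_age_seconds
    stale = []
    for entry in sorted(root.iterdir()):
        # lstat avoids following (possibly broken) symlinks.
        if entry.lstat().st_mtime > cutoff:
            continue  # touched recently, may belong to a live worker
        stale.append(entry)
        if not dry_run:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry, ignore_errors=True)
            else:
                entry.unlink(missing_ok=True)
    return stale
```

Calling it first with the default `dry_run=True` and printing the result lets you check what would be removed before deleting anything.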