Disable the warning "distributed.utils_perf - WARNING - full garbage collections took 23% CPU time recently (threshold: 10%)"


I am trying to disable the below warning in my logs
“distributed.utils_perf - WARNING - full garbage collections took 23% CPU time recently (threshold: 10%)”
For this I referred to the solutions mentioned here

I am not able to find any yaml in the path ~/.config/dask/distributed.yaml

Another thing I tried is changing the config in a Python file, but no luck.

The command below gives me an error:

import dask
dask.config.set({'logging.distributed': 'error'})

Ultimately, I am creating a local cluster using dask.distributed.LocalCluster.

Hi @hjain371,

There is no file by default; it is up to you to create one. See this documentation for more information.
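As a rough sketch of creating that file yourself: the snippet below writes a small `distributed.yaml` with a top-level `logging` key that maps logger names to levels. The helper name `write_dask_logging_config` and the chosen levels are illustrative assumptions, and raising `distributed.utils_perf` to `error` is assumed to hide the warnings from that logger only.

```python
from pathlib import Path

# Assumed layout: logger names mapped to levels under a top-level "logging" key.
# The levels chosen here are only an example.
DASK_LOGGING_YAML = """\
logging:
  distributed: info
  distributed.utils_perf: error
"""


def write_dask_logging_config(config_dir: Path) -> Path:
    """Hypothetical helper: create config_dir if needed and write distributed.yaml."""
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "distributed.yaml"
    target.write_text(DASK_LOGGING_YAML)
    return target
```

Calling `write_dask_logging_config(Path.home() / ".config" / "dask")` would place the file at the path mentioned in the question. The file has to exist before Dask is imported for the setting to be picked up at start-up.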

Could you share the stack trace of the Error?

Also, if you try within Python, but with a Distributed Scheduler instantiated outside of Python, then your Workers won’t be aware of that configuration.

From the Configuration page of the Dask documentation:

Finally, note that persistent objects may acquire configuration settings when they are initialized. These settings may also be cached for performance reasons. This is particularly true for dask.distributed objects such as Client, Scheduler, Worker, and Nanny.

Basically, the config doesn’t have any key named “logging”. The reason may be that I have not set up any distributed.yaml.

I am setting the config variable at the top of the .py file, where the modules are imported, and creating the cluster in the same file using the LocalCluster/EC2Cluster class.

Yes, not all default values are necessarily in the config at first. If you want to change these defaults, it’s your responsibility to add them through the Python API or by creating a yaml file.

Setting the config with Python works in the case of LocalCluster, which starts the distributed cluster from the same Python environment as your main thread. It will probably not work for EC2Cluster, which starts Workers in a completely different environment (but I have to admit I’m not certain about this).

So I’m not sure what the best practice is for applying a custom distributed configuration to Workers in the case of dask-cloudprovider. Maybe @jacobtomlinson has something to say?

Thanks, I will try adding a yaml file to see if it works :slight_smile:
Is there any way I can disable just this particular warning and not the entire set of warnings?

full garbage collections took 23% CPU time recently (threshold: 10%)

When you launch clusters with dask-cloudprovider your local config is synced to the workers. So any logging config you set locally should also apply to your EC2 workers.

Hi @jacobtomlinson, is it documented somewhere? I wasn’t able to find that in the online documentation, but I may have missed something.

No, I don’t think it’s explicitly documented anywhere.

To disable only this particular warning, you’ll have to implement some custom logging filtering.
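A minimal sketch of such a filter, using only the standard `logging` module: it drops records whose message contains the garbage collection text and lets everything else through. The class name `DropGCWarning` is made up for illustration, and the logger name `distributed.utils_perf` is taken from the prefix of the warning quoted above.

```python
import logging


class DropGCWarning(logging.Filter):
    """Illustrative filter: discard only the 'full garbage collections took ...' records."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False discards the record; any other message passes through.
        return "full garbage collections took" not in record.getMessage()


# Attach the filter to the logger named in the warning's prefix.
logging.getLogger("distributed.utils_perf").addFilter(DropGCWarning())
```

This only affects the process in which it runs. If workers run in separate processes, the filter would presumably need to be installed there as well, for example by running a small setup function on each worker.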