Disable Dask worker logs from printing to the console and write them to a file

When running computation-heavy tasks, the workers continuously print INFO garbage collection logs and warnings. These drown out the other console logs I use for debugging or for tracking task status. Examples below:

dask_worker_2        | distributed.utils_perf - INFO - full garbage collection released 34.77 MiB from 337 reference cycles (threshold: 9.54 MiB)
dask_worker_2        | distributed.utils_perf - WARNING - full garbage collections took 23% CPU time recently (threshold: 10%)

Here is what I tried (without success):

  1. Disabling logging using:

    import logging
    from distributed.worker import logger
    logging.disable(logging.WARNING)
    logger.warning('ignore') 
    
  2. Updating distributed.yaml to lengthen the interval between event loop health checks and the time allowed before a warning is triggered:

      admin:
        tick:
          interval: 60s #20ms  # time between event loop health checks
          limit: 300s #3s       # time allowed before triggering a warning
    
    distributed:
      version: 2
      # logging:
      #   distributed: info
      #   distributed.client: warning
    
  3. Reading through the logging source code, wondering whether the distributed.worker file should be updated.

Apart from disabling the warnings, it would be useful to write the console output to a file. I tried setting this up from a .py file, following a reference:

import dask

# Write logs to disk
logging_config = {
    "version": 1,
    "handlers": {
        "file": {
            "class": "logging.handlers.RotatingFileHandler",
            "filename": "consoleLogs.log",
            "level": "INFO",
        },
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
        }
    },
    "loggers": {
        "distributed.worker": {
            "level": "INFO",
            "handlers": ["file", "console"],
        },
        "distributed.scheduler": {
            "level": "INFO",
            "handlers": ["file", "console"],
        }
    }
}
dask.config.config['logging'] = logging_config

I was also wondering if the warning or INFO message can be logged only once (maybe with a counter for occurrences). I understand that the message string includes specific information, such as the CPU percentage, that may be dynamic, but as a user all I need to know is whether CPU utilization is above my set threshold.

I also tried the following, based on another reference:

        logger = logging.getLogger("distributed.utils_perf")
        logger.disabled
        logger.setLevel(logging.ERROR)

Hi @SOUMYASHUKLA, thanks for the question! To start, Distributed uses Python’s standard logging module and you can always check the current configuration settings using dask.config.get. To answer your questions:

To only print error messages (see this similar discourse question), you can change the logging level in ~/.config/dask/*.yaml (see controlling logging via a config file). For example, your ~/.config/dask/distributed.yaml file could look like:

logging:
  distributed: error

If you prefer to temporarily set this directly within Python you can use:

import dask
dask.config.set({'logging.distributed': 'error'})
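Since Distributed uses the standard logging module, a level set on the parent distributed logger is inherited by child loggers such as distributed.utils_perf, which is where the garbage collection messages in the question come from. Here is a minimal sketch of that inheritance using only the standard library. It sets the level directly instead of going through Dask's config, and it only affects the process it runs in, so on a real cluster the setting has to reach the worker processes:

```python
import logging

# Set the level on the parent logger only.
parent = logging.getLogger("distributed")
parent.setLevel(logging.ERROR)

# A child logger with no level of its own inherits the parent's level.
child = logging.getLogger("distributed.utils_perf")

gc_info_enabled = child.isEnabledFor(logging.INFO)        # INFO messages are dropped
gc_warning_enabled = child.isEnabledFor(logging.WARNING)  # WARNING messages are dropped
error_enabled = child.isEnabledFor(logging.ERROR)         # ERROR messages still pass
```

This is why a single distributed: error entry is usually enough, unless a child logger has been given its own, more verbose level elsewhere.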

In the snippet you shared from the issue for adding documentation on saving logs to a file, you can change the logging level such that only errors are saved and/or printed to the console, as shown above.
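As a rough illustration of combining the two ideas (write to a file, but only errors), here is a sketch that applies a dictionary-style logging configuration with the standard library's logging.config.dictConfig. The file location, the size limit, and the backup count are made-up values for the example, and applying the config this way (rather than through Dask's own configuration) is an assumption made to keep the snippet self-contained:

```python
import logging
import logging.config
import os
import tempfile

# Hypothetical log location for this example; any writable path works.
log_path = os.path.join(tempfile.mkdtemp(), "dask_errors.log")

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "file": {
            "class": "logging.handlers.RotatingFileHandler",
            "filename": log_path,
            "maxBytes": 1_000_000,  # illustrative rotation size
            "backupCount": 3,       # illustrative number of old files to keep
            "level": "ERROR",
        },
    },
    "loggers": {
        "distributed": {
            "level": "ERROR",
            "handlers": ["file"],
            "propagate": False,  # keep these records off the console
        },
    },
})

log = logging.getLogger("distributed.utils_perf")
log.warning("full garbage collections took 23% CPU time recently")  # dropped
log.error("something actually went wrong")  # written to the file

with open(log_path) as fh:
    contents = fh.read()
```

Only the error line ends up in the file; the warning is discarded before it reaches any handler.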

I was also wondering if the warning or INFO message can be logged once (maybe with a counter for occurrences).

I would recommend controlling this using a filter. There are examples on how to do this in the logging-cookbook, and then you’d change the config file to something like this, where custom_filter_class is defined in a separate Python file:

logging:
  version: 1
  handlers:
    file:
      class: logging.handlers.RotatingFileHandler
      filename: output.log
      level: INFO
      filters: [custom_filter_class]
  loggers:
    distributed:
      level: INFO
      handlers:
        - file
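To make the filter idea more concrete, here is one possible sketch of a "log once, count the rest" filter. The class names LogOnceFilter and ListHandler are hypothetical, and the message template is only modeled on the output in the question. It keys on the unformatted message template, which assumes the library passes dynamic values (such as the CPU percentage) as logging arguments rather than pre-formatting the string:

```python
import logging
from collections import Counter


class LogOnceFilter(logging.Filter):
    """Hypothetical filter: let each message template through once, count repeats."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def filter(self, record):
        # record.msg is the unformatted template, so changing values such as
        # the CPU percentage do not make every occurrence look unique.
        key = (record.name, record.levelno, record.msg)
        self.counts[key] += 1
        return self.counts[key] == 1


class ListHandler(logging.Handler):
    """Hypothetical handler that collects formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


once = LogOnceFilter()
handler = ListHandler()
handler.addFilter(once)

log = logging.getLogger("distributed.utils_perf")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False

# Template modeled on the warning shown in the question.
template = "full garbage collections took %d%% CPU time recently (threshold: %d%%)"
for cpu in (23, 31, 27):
    log.warning(template, cpu, 10)

seen = handler.messages
repeats = once.counts[(log.name, logging.WARNING, template)]
```

Only the first warning reaches the handler, while the counter records all three occurrences. With dictionary-style configuration, a custom filter is normally registered under a top-level filters key using the "()" factory key, and handlers then refer to it by that id.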

@scharlottej13 Thank you for the details. I was able to disable the warnings by editing the configuration inside the correct Docker container, using the following:

distributed:
  version: 2
  logging:
    distributed: error
    distributed.client: error
    distributed.worker: error