SSHCluster User-Generated Logs

Hi Dask Community!

We have set up a cluster of multiple machines and use SSHCluster to run tasks with Dask. We also use the standard Python logging module to write logs to log files. Since tasks are distributed across workers on different machines, each worker ends up writing its logs on its own machine. Is there a recommended solution for merging the logs from the different machines into one place (preferably on the scheduler machine)?
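To make the setup concrete, here is a stripped-down sketch of what we do (the hostnames and the log path are placeholders, not our real configuration):

```python
import logging
from dask.distributed import SSHCluster, Client

# Placeholder hostnames: the first entry runs the scheduler, the rest run workers.
cluster = SSHCluster(["scheduler-host", "worker-host-1", "worker-host-2"])
client = Client(cluster)

def process(x):
    # Each worker writes to a file on its own local disk, so the logs
    # end up scattered across the machines.
    logging.basicConfig(filename="/tmp/myapp.log", level=logging.INFO)
    logging.getLogger(__name__).info("processing %s", x)
    return x * 2

results = client.gather(client.map(process, range(10)))
```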

Hi @stepanyanhayk, welcome to this forum!

So you are configuring a specific file for your logs, rather than printing them along with the default Dask worker logs?

If so, I guess the Client.get_worker_logs function is of no use to you?

In that case, you’ll probably have to write your own solution. Either outside of Dask, using infrastructure such as a shared NFS mount, or something more complex like syslog or the Elastic stack, to centralize things; or inside of Dask, with some specific code, maybe using Bag, to read and merge all the logs.
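For the Bag option, a minimal sketch could look like the following, assuming all the worker log files are reachable from one place, for example a shared NFS mount at a placeholder path /shared/logs:

```python
import dask.bag as db

# Assumption: each worker writes its own file under a shared mount,
# e.g. /shared/logs/worker-<name>.log (placeholder path).
lines = db.read_text("/shared/logs/worker-*.log")

# Pull every line back to one machine and merge them; sorting by line assumes
# each line starts with a lexicographically sortable timestamp.
merged = sorted(lines.compute())

with open("/shared/logs/merged.log", "w") as f:
    f.writelines(merged)
```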

Hi @guillaumeeb, thank you very much for your reply!

Outputting to a file is not mandatory; I can also print the logs. However, the problem is that the workers are on different physical machines. Is there a way to emit user-generated logs alongside the default Dask worker logs so I can then retrieve them with client.get_worker_logs?
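For example, something along these lines is what I am hoping for; I am only guessing that logging through a logger under the distributed namespace would land in the same buffer that get_worker_logs reads from (the scheduler address is a placeholder):

```python
import logging
from dask.distributed import Client

client = Client("scheduler-host:8786")  # placeholder scheduler address

def process(x):
    # Guess: log through a logger in the "distributed" namespace so the
    # record propagates to the same handlers as the default worker logs.
    logging.getLogger("distributed.worker").info("processing %s", x)
    return x * 2

client.gather(client.map(process, range(10)))

# Then, back on the client, fetch the recent log records from every worker.
for worker_addr, records in client.get_worker_logs().items():
    print(worker_addr, records)
```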

Thank you,
Hayk