Forward worker logs for a cluster deployed via dask_yarn.YarnCluster

We have a cluster running via dask_yarn.YarnCluster and want to forward the worker logs to specific files.

If we were doing this via the CLI, it would look something like this:

dask worker tcp://192.0.0.100:8786 > worker.log

How can we forward the logs for each worker to a specified file? Is there an environment variable we can set?

I can see two ways to achieve this, though I’m not sure they’ll work easily, or at all, as I’m not familiar with dask-yarn.

But a question first: doesn’t YARN already provide a way to retrieve the logs of submitted applications (for example, yarn logs -applicationId <application ID>)? Are you able to see the logs and just want them in another location?

I would try either of the following:

  • Build a custom Skein specification and modify the worker command to add the redirection.
  • Define a logging configuration in YAML files. However, I’m not sure that defining this on the client side would be enough; you might have to push the file to the worker nodes for them to take it into account.
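
To make the second option more concrete, here is a minimal sketch of a file-based logging configuration, written with Python's standard logging.config.dictConfig rather than dask-yarn itself. As far as I know, Dask's logging config key accepts the same dictionary layout when it contains version: 1, but I have not verified that on a YARN cluster, so treat it as an assumption. The log path, the handler name worker_file, and the helper build_logging_config are hypothetical names chosen for illustration.

```python
import logging
import logging.config
import os
import socket
import tempfile


def build_logging_config(log_path):
    """Return a dictConfig-style dict sending 'distributed' logs to log_path."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {
                "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
            },
        },
        "handlers": {
            # Hypothetical handler name; writes records to a plain file.
            "worker_file": {
                "class": "logging.FileHandler",
                "filename": log_path,
                "formatter": "plain",
            },
        },
        "loggers": {
            # Dask's distributed loggers live under the "distributed" namespace.
            "distributed": {"level": "INFO", "handlers": ["worker_file"]},
        },
    }


# Hypothetical location: one file per host and process, so that several
# workers on the same node do not write to the same file.
log_path = os.path.join(
    tempfile.gettempdir(),
    f"dask-worker-{socket.gethostname()}-{os.getpid()}.log",
)

logging.config.dictConfig(build_logging_config(log_path))

# Any logger below "distributed" now also writes to the file.
logging.getLogger("distributed.worker").info("worker logging configured")
```

The configuration only affects the process in which it is applied, which is why it would have to reach the worker processes (for instance through a config file shipped to the worker nodes) rather than live only on the client.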