Yup, same info as `info/main/workers.html`, just without scraping.
`client.scheduler_info()` will (confusingly) give you all those worker metrics you'd see on the dashboard, including CPU. Testing locally, I see something like:
```
In [3]: client.scheduler_info()
Out[3]:
{'type': 'Scheduler',
 'id': 'Scheduler-a81a9c65-8eaa-4164-8dc9-09016bb23574',
 'address': 'tcp://127.0.0.1:62942',
 'services': {'dashboard': 8787},
 'started': 1641523098.2240808,
 'workers': {'tcp://127.0.0.1:62953': {'type': 'Worker',
   'id': 2,
   'host': '127.0.0.1',
   'resources': {},
   'local_directory': '/Users/gabe/dev/dask/dask-worker-space/worker-1jw9np4_',
   'name': 2,
   'nthreads': 4,
   'memory_limit': 8589934592,
   'last_seen': 1641523105.8124008,
   'services': {'dashboard': 62955},
   'metrics': {'executing': 0,
    'in_memory': 0,
    'ready': 0,
    'in_flight': 0,
    'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}},
    'spilled_nbytes': 0,
    'cpu': 2.8,
    'memory': 74620928,
    'time': 1641523105.810308,
    'read_bytes': 12294.81236837962,
    'write_bytes': 18442.21855256943,
    'read_bytes_disk': 0.0,
    'write_bytes_disk': 0.0,
    'num_fds': 30},
   'nanny': 'tcp://127.0.0.1:62945'},
```
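If all you need is, say, per-worker CPU, you can pull it straight out of that structure. A minimal sketch based on the fields shown above, assuming `client` is a `distributed.Client` already connected to your cluster:

```python
# Pull per-worker CPU (and memory) out of scheduler_info(), using the
# dict structure shown in the output above.
info = client.scheduler_info()
for addr, worker in info["workers"].items():
    metrics = worker["metrics"]
    print(f"{addr}: cpu={metrics['cpu']}%, memory={metrics['memory']} bytes")
```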
Notice that you can also find out how many tasks are actually executing (vs queued) on each worker, if that's interesting. Note, though, that these metrics are what workers report to the scheduler at regular intervals, so they'll be slightly (milliseconds to seconds) out of date and may be slightly inconsistent with the scheduler's task counts from `Client.processing()`. But that may not matter for your use case, so you could probably just use `scheduler_info()` and skip `processing()`.
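If you do want to sanity-check the two views against each other, a quick sketch (again assuming an existing `client`) might look like:

```python
# Compare the scheduler's view of tasks assigned to each worker
# (Client.processing()) with the worker-reported 'executing' counts from
# scheduler_info(). The two can disagree briefly, since worker metrics
# only arrive on a heartbeat.
processing = client.processing()  # {worker address: list of task keys}
info = client.scheduler_info()

for addr, worker in info["workers"].items():
    assigned = len(processing.get(addr, ()))
    executing = worker["metrics"]["executing"]
    print(f"{addr}: assigned={assigned}, executing={executing}")
```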
Is there an issue open for this? Having to hack around deadlocks via k8s is really not ideal.