Search for deadlock cause | freeze upon data read; distributed; other?

Correction: 2023.10.0.

Another deadlock occurred and it did not happen around a data read. I think that this is weakening this hypothesis.

I am pasting below the last few seconds of all client-side distributed logs below (32k character post limit). I can share full logs if needed.

The logs show that completed tasks reach the released state but never reach the forgotten state. I believe that it is not normal? My local cluster might be up for 24 hours, runs more than 1e6 tasks, and no task ever gets forgotten?

Some notes maybe related to that:

  1. most of my tasks are launched from other tasks. I always use a worker_client context manager with the recommended scheme:
with worker_client(separate_thread=True) as client:
	future = client.submit(task)
	client.gather(future)
  1. I am still experiencing this issue. Already gathered tasks get randomly re-run in large batches several hours later, unless I use this workaround: cancelling root tasks upon gathering them. I call “root” tasks the original ones in the context of launching tasks from tasks. Is this cancel-upon-gathering workaround a problem?

  2. Almost all my root tasks are bound methods of dataclass instances.

  3. Many of the tasks read data from a zarr-formatted xarray dataset. It can happen that tasks read the same data. Could this cause a dependency issue among tasks that is preventing them to reach the forgotten state?
    In one of the tasks that gets run 20x/minute, I switched to using chunks=None to avoid dask in xarray.open_zarr. But I cannot do this in other tasks where I need dask to load more data in many chunks.

Last few seconds of distributed logs before deadlock
DEBUG 2024-08-20T22:13:09.879 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:09.928 | distributed.core | Message from 'tcp://127.0.0.1:52831': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:51874', 'now': 1724206389.9142122, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'latency': 0.00785517692565918, 'tick-duration': 0.5060315132141113}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 239, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 6347}, 'event_loop_interval': 0.020789613326390583, 'cpu': 0.0, 'memory': 770019328, 'time': 1724206389.3883238, 'host_net_io': {'read_bps': 187185.69774082265, 'write_bps': 5075574.697301888}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 100856.97151691602}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:09.930 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:09.951 | distributed.core | Message from 'tcp://127.0.0.1:52993': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52968', 'now': 1724206389.9348505, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.5160410404205322, 'latency': 0.029094219207763672}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 189, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 3306}, 'event_loop_interval': 0.020760964374152983, 'cpu': 0.0, 'memory': 673681408, 'time': 1724206389.4199903, 'host_net_io': {'read_bps': 169312.66657718315, 'write_bps': 5147930.481073159}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 101469.09946771116}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:09.955 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:09.979 | distributed.core | Message from 'tcp://127.0.0.1:52032': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:09.980 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.013 | distributed.core | Message from 'tcp://127.0.0.1:50707': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.015 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.028 | distributed.core | Message from 'tcp://127.0.0.1:54122': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:54089', 'now': 1724206390.0254462, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'latency': 0.020717382431030273, 'tick-duration': 0.5045604705810547}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 0, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 59}, 'event_loop_interval': 0.020172195434570314, 'cpu': 9.1, 'memory': 421269504, 'time': 1724206389.522894, 'host_net_io': {'read_bps': 171754.8490986988, 'write_bps': 5202476.753975689}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.030 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.079 | distributed.core | Message from 'tcp://127.0.0.1:52804': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52800', 'now': 1724206390.054684, 'metrics': {'task_counts': {'executing': 1}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'profile-duration': 0.007178544998168945, 'latency': 0.016718626022338867, 'tick-duration': 0.4435253143310547}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 258, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 3616}, 'event_loop_interval': 0.02072794096810477, 'cpu': 11.7, 'memory': 657149952, 'time': 1724206389.5803127, 'host_net_io': {'read_bps': 167238.9418620977, 'write_bps': 5098746.718282465}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {'_run-e3e39d0c8564d22315971f56ffbedd69': 1.25016188621521}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.081 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.106 | distributed.core | Message from 'tcp://127.0.0.1:53547': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.106 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.218 | distributed.core | Message from 'tcp://127.0.0.1:53526': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:53523', 'now': 1724206390.204076, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.5164482593536377, 'latency': 0.02159428596496582}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 33, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 1574}, 'event_loop_interval': 0.020813201864560444, 'cpu': 0.0, 'memory': 698822656, 'time': 1724206390.194567, 'host_net_io': {'read_bps': 140642.99157572613, 'write_bps': 4414243.372120053}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.218 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.221 | distributed.core | Message from 'tcp://127.0.0.1:54123': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.221 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.262 | distributed.core | Message from 'tcp://127.0.0.1:52041': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52029', 'now': 1724206390.2301447, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.4809408187866211, 'latency': 0.04036593437194824}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 366, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 5920}, 'event_loop_interval': 0.02025573861365225, 'cpu': 6.2, 'memory': 794079232, 'time': 1724206389.7502222, 'host_net_io': {'read_bps': 170728.9686614519, 'write_bps': 5207211.566705583}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.262 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.361 | distributed.core | Message from 'tcp://127.0.0.1:52452': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52449', 'now': 1724206390.34587, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.5130734443664551, 'latency': 0.007166147232055664}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 350, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 4496}, 'event_loop_interval': 0.019679331311992572, 'cpu': 3.2, 'memory': 710520832, 'time': 1724206390.3271322, 'host_net_io': {'read_bps': 144173.60316119067, 'write_bps': 4599621.29790883}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.362 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.378 | distributed.core | Message from 'tcp://127.0.0.1:52542': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52539', 'now': 1724206390.3692362, 'metrics': {'task_counts': {'long-running': 1}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.49875473976135254, 'profile-duration': 0.0038979053497314453, 'latency': 0.021382808685302734}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 343, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 4284}, 'event_loop_interval': 0.021115181293893368, 'cpu': 39.4, 'memory': 2138636288, 'time': 1724206389.8718908, 'host_net_io': {'read_bps': 164541.58734773673, 'write_bps': 5049403.500779633}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {'_run-4afda04ff3bdaa754bd5b62dbef06233': 215.26723909378052}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.379 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.432 | distributed.core | Message from 'tcp://127.0.0.1:51918': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:51874', 'now': 1724206390.4176645, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.48685312271118164, 'latency': 0.01943182945251465}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 239, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 6347}, 'event_loop_interval': 0.020789613326390583, 'cpu': 3.0, 'memory': 770019328, 'time': 1724206389.9151478, 'host_net_io': {'read_bps': 165026.49364236387, 'write_bps': 5118670.449071058}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.434 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.461 | distributed.core | Message from 'tcp://127.0.0.1:54092': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.463 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.468 | distributed.core | Message from 'tcp://127.0.0.1:54106': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52968', 'now': 1724206390.4482503, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'latency': 0.01152801513671875, 'tick-duration': 0.4828603267669678}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 189, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 3306}, 'event_loop_interval': 0.020760964374152983, 'cpu': 3.0, 'memory': 673681408, 'time': 1724206389.9359024, 'host_net_io': {'read_bps': 154627.35551608173, 'write_bps': 4815533.21277468}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.469 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.510 | distributed.core | Message from 'tcp://127.0.0.1:54122': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:54089', 'now': 1724206390.5073721, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.5123684406280518, 'latency': 0.011069536209106445}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 0, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 59}, 'event_loop_interval': 0.019717864990234375, 'cpu': 0.0, 'memory': 421269504, 'time': 1724206390.0262668, 'host_net_io': {'read_bps': 154490.14933141595, 'write_bps': 4831854.936692853}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.511 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.592 | distributed.core | Message from 'tcp://127.0.0.1:52848': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52800', 'now': 1724206390.5690987, 'metrics': {'task_counts': {'executing': 1}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.5274598598480225, 'latency': 0.020902633666992188, 'profile-duration': 0.007411956787109375}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 258, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 3616}, 'event_loop_interval': 0.020236516485408862, 'cpu': 3.1, 'memory': 657149952, 'time': 1724206390.5497952, 'host_net_io': {'read_bps': 144144.83365835727, 'write_bps': 4573182.1889181845}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 3695757.637621699}}, 'executing': {'_run-e3e39d0c8564d22315971f56ffbedd69': 1.7645766735076904}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.592 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.603 | distributed.core | Message from 'tcp://127.0.0.1:52971': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.606 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.610 | distributed.core | Message from 'tcp://127.0.0.1:50649': {'op': 'set_metadata', 'keys': ('cluster-manager-info',), 'value': {'name': 'process', 'type': 'distributed.deploy.local.LocalCluster'}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.612 | distributed.core | Calling into handler set_metadata | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.657 | distributed.core | Message from 'tcp://127.0.0.1:52478': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.660 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.724 | distributed.core | Message from 'tcp://127.0.0.1:53547': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:53523', 'now': 1724206390.711933, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.4953775405883789, 'latency': 0.006109476089477539}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 33, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 1574}, 'event_loop_interval': 0.02002980662327187, 'cpu': 3.1, 'memory': 698822656, 'time': 1724206390.6897569, 'host_net_io': {'read_bps': 164298.5520704387, 'write_bps': 5124608.736025628}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 3686019.507652876}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.725 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.797 | distributed.core | Message from 'tcp://127.0.0.1:52476': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52029', 'now': 1724206390.7602613, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.47981858253479004, 'latency': 0.0326840877532959}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 366, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 5920}, 'event_loop_interval': 0.02045477197525349, 'cpu': 0.0, 'memory': 794079232, 'time': 1724206390.7401502, 'host_net_io': {'read_bps': 160104.912050841, 'write_bps': 4938728.885129144}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 3588715.662984565}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.797 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.842 | distributed.core | Message from 'tcp://127.0.0.1:52477': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52449', 'now': 1724206390.8379126, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.492218017578125, 'latency': 0.015117645263671875}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 350, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 4496}, 'event_loop_interval': 0.019679331311992572, 'cpu': 0.0, 'memory': 710520832, 'time': 1724206390.8192666, 'host_net_io': {'read_bps': 160738.64390191913, 'write_bps': 4944908.330893248}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 3708906.288367296}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.844 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.863 | distributed.core | Message from 'tcp://127.0.0.1:52596': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52539', 'now': 1724206390.8530564, 'metrics': {'task_counts': {'long-running': 1}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.5076727867126465, 'profile-duration': 0.0048067569732666016, 'latency': 0.009158134460449219}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 343, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 4284}, 'event_loop_interval': 0.021115181293893368, 'cpu': 38.7, 'memory': 2139447296, 'time': 1724206390.8461154, 'host_net_io': {'read_bps': 159124.6745864264, 'write_bps': 4883870.184333095}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 189306.7882544334}}, 'executing': {'_run-4afda04ff3bdaa754bd5b62dbef06233': 215.75105929374695}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.865 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.886 | distributed.core | Message from 'tcp://127.0.0.1:51877': {'op': 'identity', 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.887 | distributed.core | Calling into handler identity | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.918 | distributed.core | Message from 'tcp://127.0.0.1:52831': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:51874', 'now': 1724206390.9042845, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.523423433303833, 'latency': 0.021486997604370117}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 239, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 6347}, 'event_loop_interval': 0.02021289825439453, 'cpu': 0.0, 'memory': 770019328, 'time': 1724206390.4189193, 'host_net_io': {'read_bps': 151288.7683047524, 'write_bps': 4695936.671300749}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 3623228.537837676}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.921 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:10.979 | distributed.core | Message from 'tcp://127.0.0.1:52993': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:52968', 'now': 1724206390.955338, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'tick-duration': 0.48592066764831543, 'latency': 0.01643824577331543}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 189, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 3306}, 'event_loop_interval': 0.02009161313374837, 'cpu': 6.2, 'memory': 673681408, 'time': 1724206390.934552, 'host_net_io': {'read_bps': 159527.44216730804, 'write_bps': 4941169.436311122}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 0.0}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:10.979 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822
DEBUG 2024-08-20T22:13:11.014 | distributed.core | Message from 'tcp://127.0.0.1:54092': {'op': 'heartbeat_worker', 'address': 'tcp://127.0.0.1:54089', 'now': 1724206391.0100653, 'metrics': {'task_counts': {}, 'bandwidth': {'total': 100000000, 'workers': {}, 'types': {}}, 'digests_total_since_heartbeat': {'latency': 0.012102603912353516, 'tick-duration': 0.5027740001678467}, 'managed_bytes': 0, 'spilled_bytes': {'memory': 0, 'disk': 0}, 'transfer': {'incoming_bytes': 0, 'incoming_count': 0, 'incoming_count_total': 0, 'outgoing_bytes': 0, 'outgoing_count': 0, 'outgoing_count_total': 59}, 'event_loop_interval': 0.019717864990234375, 'cpu': 0.0, 'memory': 421269504, 'time': 1724206390.5082853, 'host_net_io': {'read_bps': 142699.76909649736, 'write_bps': 4534095.129038394}, 'host_disk_io': {'read_bps': 0.0, 'write_bps': 3786738.697029174}}, 'executing': {}, 'extensions': {'spans': {}, 'shuffle': {}}, 'reply': True} | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 760
DEBUG 2024-08-20T22:13:11.017 | distributed.core | Calling into handler heartbeat_worker | 2260 | MainProcess | IO loop | 13880 | core | _handle_comm | 822