Error on invoking client in Azure Kubernetes Service cluster

Hi Team,

I am getting the following error when calling submit on a Client object connected to a Dask scheduler running in AKS. Any help would be highly appreciated. I have also added the scheduler-side log at the bottom.

root@python:~# python3
Python 3.10.12 (main, Aug 16 2023, 20:13:22) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from dask.distributed import Client
>>> client = Client("tcp://10.0.161.177:8786")
/usr/local/lib/python3.10/site-packages/distributed/client.py:1470: VersionMismatchWarning: Mismatched versions found

+-------------+----------+-----------+----------+
| Package     | Client   | Scheduler | Workers  |
+-------------+----------+-----------+----------+
| dask        | 2024.7.0 | 2024.1.0  | 2024.1.0 |
| distributed | 2024.7.0 | 2024.1.0  | 2024.1.0 |
| lz4         | None     | 4.3.3     | 4.3.3    |
| msgpack     | 1.0.8    | 1.0.7     | 1.0.7    |
| numpy       | None     | 1.26.3    | 1.26.3   |
| pandas      | None     | 2.1.4     | 2.1.4    |
| toolz       | 0.12.1   | 0.12.0    | 0.12.0   |
| tornado     | 6.4.1    | 6.3.3     | 6.3.3    |
+-------------+----------+-----------+----------+
  warnings.warn(version_module.VersionMismatchWarning(msg[0]["warning"]))

>>> def square(x):
...     return x ** 2
...
>>> def neg(x):
...     return -x
...
>>> A = client.map(square, range(10))
>>> B = client.map(neg, A)
>>> total = client.submit(sum, B)
>>> total.result()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/distributed/client.py", line 392, in result
    return self.client.sync(self._result, callback_timeout=timeout)
  File "/usr/local/lib/python3.10/site-packages/distributed/client.py", line 408, in _result
    raise exception
distributed.client.FutureCancelledError: sum-008fed9aa78f3dcc5601049e13d83c3f cancelled for reason: scheduler-connection-lost.
Client lost the connection to the scheduler. Please check your connection and re-run your work.


Scheduler log:
2024-07-10 15:56:31,897 - distributed.scheduler - INFO - State start
2024-07-10 15:56:31,899 - distributed.scheduler - INFO - -----------------------------------------------
2024-07-10 15:56:31,900 - distributed.scheduler - INFO - Scheduler at: tcp://10.224.0.131:8786
2024-07-10 15:56:31,900 - distributed.scheduler - INFO - dashboard at: http://10.224.0.131:8787/status
2024-07-10 15:56:31,900 - distributed.scheduler - INFO - Registering Worker plugin shuffle
2024-07-10 15:56:33,179 - distributed.scheduler - INFO - Register worker <WorkerState 'tcp://10.224.0.128:35095', status: init, memory: 0, processing: 0>
2024-07-10 15:56:34,022 - distributed.scheduler - INFO - Starting worker compute stream, tcp://10.224.0.128:35095
2024-07-10 15:56:34,022 - distributed.core - INFO - Starting established connection to tcp://10.224.0.128:47476
2024-07-10 15:56:34,023 - distributed.scheduler - INFO - Register worker <WorkerState 'tcp://10.224.0.104:39309', status: init, memory: 0, processing: 0>
2024-07-10 15:56:34,023 - distributed.scheduler - INFO - Starting worker compute stream, tcp://10.224.0.104:39309
2024-07-10 15:56:34,023 - distributed.core - INFO - Starting established connection to tcp://10.224.0.104:49944
2024-07-10 15:56:34,024 - distributed.scheduler - INFO - Register worker <WorkerState 'tcp://10.224.1.64:39071', status: init, memory: 0, processing: 0>
2024-07-10 15:56:34,024 - distributed.scheduler - INFO - Starting worker compute stream, tcp://10.224.1.64:39071
2024-07-10 15:56:34,024 - distributed.core - INFO - Starting established connection to tcp://10.224.1.64:40672
2024-07-10 16:05:38,634 - distributed.scheduler - INFO - Receive client connection: Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:05:38,634 - distributed.core - INFO - Starting established connection to tcp://10.224.1.18:41920
2024-07-10 16:06:07,693 - distributed.scheduler - INFO - Remove client Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:07,693 - distributed.scheduler - INFO - Close client connection: Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:07,693 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 832, in wrapper
    return await func(*args, **kwargs)
TypeError: Scheduler.update_graph() got an unexpected keyword argument 'span_metadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 969, in _handle_comm
    result = await result
  File "/opt/conda/lib/python3.10/site-packages/distributed/scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 1050, in handle_stream
    await handler(**merge(extra, msg))
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 831, in wrapper
    with self:
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 852, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
Task exception was never retrieved
future: <Task finished name='Task-41441' coro=<Server._handle_comm() done, defined at /opt/conda/lib/python3.10/site-packages/distributed/core.py:875> exception=IndexError('list index out of range')>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 832, in wrapper
    return await func(*args, **kwargs)
TypeError: Scheduler.update_graph() got an unexpected keyword argument 'span_metadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 969, in _handle_comm
    result = await result
  File "/opt/conda/lib/python3.10/site-packages/distributed/scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 1050, in handle_stream
    await handler(**merge(extra, msg))
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 831, in wrapper
    with self:
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 852, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
2024-07-10 16:06:07,696 - distributed.scheduler - INFO - Receive client connection: Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:07,697 - distributed.core - INFO - Starting established connection to tcp://10.224.1.18:54100
2024-07-10 16:06:14,932 - distributed.scheduler - INFO - Remove client Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:14,932 - distributed.scheduler - INFO - Close client connection: Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:14,932 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 832, in wrapper
    return await func(*args, **kwargs)
TypeError: Scheduler.update_graph() got an unexpected keyword argument 'span_metadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 969, in _handle_comm
    result = await result
  File "/opt/conda/lib/python3.10/site-packages/distributed/scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 1050, in handle_stream
    await handler(**merge(extra, msg))
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 831, in wrapper
    with self:
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 852, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
Task exception was never retrieved
future: <Task finished name='Task-43652' coro=<Server._handle_comm() done, defined at /opt/conda/lib/python3.10/site-packages/distributed/core.py:875> exception=IndexError('list index out of range')>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 832, in wrapper
    return await func(*args, **kwargs)
TypeError: Scheduler.update_graph() got an unexpected keyword argument 'span_metadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 969, in _handle_comm
    result = await result
  File "/opt/conda/lib/python3.10/site-packages/distributed/scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 1050, in handle_stream
    await handler(**merge(extra, msg))
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 831, in wrapper
    with self:
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 852, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
2024-07-10 16:06:14,936 - distributed.scheduler - INFO - Receive client connection: Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:14,936 - distributed.core - INFO - Starting established connection to tcp://10.224.1.18:48326
2024-07-10 16:06:25,839 - distributed.scheduler - INFO - Remove client Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:25,840 - distributed.scheduler - INFO - Close client connection: Client-3fce7dc0-3ed6-11ef-801d-12f3cc8d443a
2024-07-10 16:06:25,840 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 832, in wrapper
    return await func(*args, **kwargs)
TypeError: Scheduler.update_graph() got an unexpected keyword argument 'span_metadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 969, in _handle_comm
    result = await result
  File "/opt/conda/lib/python3.10/site-packages/distributed/scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "/opt/conda/lib/python3.10/site-packages/distributed/core.py", line 1050, in handle_stream
    await handler(**merge(extra, msg))
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 831, in wrapper
    with self:
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 852, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
Task exception was never retrieved
future: <Task finished name='Task-44208' coro=<Server._handle_comm() done, defined at /opt/conda/lib/python3.10/site-packages/distributed/core.py:875> exception=IndexError('list index out of range')>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/distributed/utils.py", line 832, in wrapper
    return await func(*args, **kwargs)
TypeError: Scheduler.update_graph() got an unexpected keyword argument 'span_metadata'

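The VersionMismatchWarning at the top is telling you that the version of Dask installed on your cluster (2024.1.0 on the scheduler and workers) does not match the version installed on your client (2024.7.0).

I expect the error is happening because of this. The scheduler log points the same way: the TypeError says Scheduler.update_graph() got an unexpected keyword argument 'span_metadata', which suggests the newer client is sending an argument that the older scheduler does not accept.

I would recommend you either upgrade Dask on your cluster or downgrade it on your client, so that dask and distributed are the same version everywhere.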
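
If it helps to confirm the mismatch from code rather than from the warning text, here is a minimal sketch. It assumes the dictionary returned by client.get_versions() has "client", "scheduler" and "workers" entries, each carrying a "packages" mapping; treat that layout as an assumption and check it against your installed version. The helper name find_mismatches and the sample data are hypothetical; the sample is hand-built from the table in the warning above so the snippet runs without a cluster.

```python
def find_mismatches(versions):
    """Return packages whose versions differ between client, scheduler and workers.

    `versions` is assumed to look like the result of Client.get_versions():
    {"client": {"packages": {...}}, "scheduler": {"packages": {...}},
     "workers": {address: {"packages": {...}}, ...}}
    """
    client = versions["client"]["packages"]
    scheduler = versions["scheduler"]["packages"]
    workers = [info["packages"] for info in versions["workers"].values()]

    # Collect every package name reported by any side.
    names = set(client) | set(scheduler)
    for pkgs in workers:
        names |= set(pkgs)

    mismatches = {}
    for name in sorted(names):
        worker_versions = {pkgs.get(name) for pkgs in workers}
        seen = {client.get(name), scheduler.get(name)} | worker_versions
        if len(seen) > 1:  # more than one distinct version (or a missing one)
            mismatches[name] = {
                "client": client.get(name),
                "scheduler": scheduler.get(name),
                "workers": worker_versions,
            }
    return mismatches


# Hand-built sample using values from the warning table.
# "example-pkg" is made up, only to show that matching packages are skipped.
cluster_pkgs = {"dask": "2024.1.0", "distributed": "2024.1.0",
                "lz4": "4.3.3", "example-pkg": "1.0"}
sample = {
    "client": {"packages": {"dask": "2024.7.0", "distributed": "2024.7.0",
                            "lz4": None, "example-pkg": "1.0"}},
    "scheduler": {"packages": dict(cluster_pkgs)},
    "workers": {"tcp://10.224.0.128:35095": {"packages": dict(cluster_pkgs)}},
}

result = find_mismatches(sample)
for name, found in result.items():
    print(name, found)
```

Against a live cluster you would pass client.get_versions() instead of the sample. Once you know which side is off, pinning both packages to the same release should clear the warning, for example something like pip install "dask==2024.1.0" "distributed==2024.1.0" on the client, or the equivalent image upgrade on the cluster.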
