ValueError: bytes object is too large

Hello everyone!
When I use a Python package designed for InSAR computation and displacement calculation (array computations), I run into this error:

2024-05-28 14:19:55,803 - distributed.protocol.core - CRITICAL - Failed to Serialize
Traceback (most recent call last):
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/protocol/core.py", line 109, in dumps
    frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/msgpack/__init__.py", line 36, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 294, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 300, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 297, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 202, in msgpack._cmsgpack.Packer._pack
ValueError: bytes object is too large
2024-05-28 14:19:55,807 - distributed.comm.utils - ERROR - bytes object is too large
Traceback (most recent call last):
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/comm/utils.py", line 34, in _to_frames
    return list(protocol.dumps(msg, **kwargs))
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/protocol/core.py", line 109, in dumps
    frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/msgpack/__init__.py", line 36, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 294, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 300, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 297, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 202, in msgpack._cmsgpack.Packer._pack
ValueError: bytes object is too large
2024-05-28 14:19:55,811 - distributed.batched - ERROR - Error in batched write
Traceback (most recent call last):
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/batched.py", line 115, in _background_send
    nbytes = yield coro
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/tornado/gen.py", line 767, in run
    value = future.result()
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/comm/tcp.py", line 264, in write
    frames = await to_frames(
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/comm/utils.py", line 48, in to_frames
    return await offload(_to_frames)
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/utils.py", line 1540, in run_in_executor_with_context
    return await loop.run_in_executor(
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/utils.py", line 1541, in <lambda>
    executor, lambda: context.run(func, *args, **kwargs)
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/comm/utils.py", line 34, in _to_frames
    return list(protocol.dumps(msg, **kwargs))
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/distributed/protocol/core.py", line 109, in dumps
    frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
  File "/home/gebo/py_venv/cophil/lib/python3.10/site-packages/msgpack/__init__.py", line 36, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 294, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 300, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 297, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 202, in msgpack._cmsgpack.Packer._pack
ValueError: bytes object is too large

I am not able to debug this because there is no specific error in my own code to trace back to; all of the frames above are inside Dask.

I am using Dask version 2024.5.1. Has anyone encountered this, and does anyone have an idea of what can cause this behavior?

Hi @GB1995, welcome to Dask Community,

There are several topics discussing this, as well as a GitHub issue.

The general cause is that you are trying to send a big object (> 4 GiB) through the task graph, which is not possible and really not recommended. Did you see any warning before the error, like

UserWarning: Large object of size 4.01 GiB detected in task graph

?

How are you reading your data and launching your computations?
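
In case it helps while you look into that, here is a minimal sketch of the usual cause and the usual workarounds. It assumes the large object is an in-memory NumPy array being passed directly into the computation; the array and variable names are placeholders, not taken from your actual workflow:

```python
import numpy as np
from dask.distributed import Client

client = Client()  # local cluster, just for illustration

# Hypothetical stand-in for a large array (e.g. an interferogram stack).
# In the failing case this would be several GiB in memory.
stack = np.zeros((1000, 1000), dtype=np.float32)

# Anti-pattern: passing the in-memory array straight into submit/map embeds it
# in the task graph, which the scheduler serializes with msgpack and hits the
# 4 GiB frame limit:
# future = client.submit(np.nanmean, stack)

# Workaround: scatter the data to the workers first and pass the resulting
# Future; the bytes then travel over the worker data channel, not the graph.
stack_future = client.scatter(stack)
result = client.submit(np.nanmean, stack_future).result()
```

The more robust fix is usually to avoid materializing the full array on the client at all: open the files lazily (with dask.array, xarray, or the package's own chunked readers) so that each task loads only its own piece from disk. That is why knowing how you read the data and launch the computation matters here.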