Dask uploading local code to remote workers

I’m trying to run some local code on an EC2Cluster, and I’ve made the modules available to the workers:

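        from dask.distributed import Client
        from dask_cloudprovider.aws import EC2Cluster

        # aws_cfg is a dict of EC2Cluster keyword arguments defined elsewhere
        ec2_cluster = EC2Cluster(**aws_cfg)
        client = Client(ec2_cluster)
        # Ship each local module to the workers
        client.upload_file("aws_errors.py")
        client.upload_file("aws_utils.py")
        client.upload_file("dask_errors.py")
        client.upload_file("dask_task.py")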

However, I run into the following exception and I can’t determine why:

C:\Python3.7\lib\site-packages\distributed\client.py:1265: VersionMismatchWarning: Mismatched versions found

+-------------+-----------+-----------+---------+
| Package     | client    | scheduler | workers |
+-------------+-----------+-----------+---------+
| blosc       | None      | 1.10.2    | None    |
| dask        | 2022.02.0 | 2022.02.1 | None    |
| distributed | 2022.02.0 | 2022.2.1  | None    |
| lz4         | None      | 3.1.10    | None    |
+-------------+-----------+-----------+---------+
  warnings.warn(version_module.VersionMismatchWarning(msg[0]["warning"]))
Traceback (most recent call last):
  File "C:\repo\ampersan\aws_utils\python\soothsayer\soothsayer\dask_task.py", line 135, in run_dask_ec2
    return result.result()
  File "C:\Python3.7\lib\site-packages\distributed\client.py", line 275, in result
    raise exc.with_traceback(tb)
  File "/opt/conda/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 66, in loads
TypeError: 'bytes' object cannot be interpreted as an integer

Thanks in advance

Thanks for the question @davico888!

First, although the VersionMismatchWarning is only a warning, you should be able to clear it by aligning the packages listed in the message: dask and distributed are at 2022.02.0 on the client but 2022.02.1 on the scheduler.

For the TypeError: 'bytes' object cannot be interpreted as an integer, does this happen when you upload dask_task.py from the client or when you try to run it? Are you able to reproduce the problem using a local cluster? Sharing a minimal reproducible example would also help in diagnosing this!


I found the solution to this. It seems the Python version needs to be consistent across the client, the scheduler, and the workers.
E.g., my local machine was running Python 3.8 while the remote scheduler and worker were on Python 3.10.
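
For anyone hitting the same thing, here is a rough sketch of how the Python versions could be compared before submitting work. It assumes the dictionary returned by `client.get_versions()` has `client`, `scheduler`, and `workers` entries, each with a `host` section containing a `python` string; that is my understanding of the layout, so check it against your installed version. The helper names and the sample data are hypothetical and only illustrate the mismatch described above.

```python
from typing import Any, Dict


def major_minor(version: str) -> str:
    """Reduce a version string such as '3.10.2.final.0' to '3.10'."""
    return ".".join(version.split(".")[:2])


def python_versions(versions: Dict[str, Any]) -> Dict[str, str]:
    """Map 'client', 'scheduler', and each worker address to its Python major.minor.

    Assumes the nested layout {'host': {'python': ...}} for every node.
    """
    found = {
        "client": major_minor(versions["client"]["host"]["python"]),
        "scheduler": major_minor(versions["scheduler"]["host"]["python"]),
    }
    for address, info in versions.get("workers", {}).items():
        found[address] = major_minor(info["host"]["python"])
    return found


def has_python_mismatch(versions: Dict[str, Any]) -> bool:
    """Return True when the nodes do not all share one Python major.minor."""
    return len(set(python_versions(versions).values())) > 1


# Hypothetical sample data shaped like the assumed get_versions() result:
# a 3.8 client talking to a 3.10 scheduler and worker.
sample = {
    "client": {"host": {"python": "3.8.10.final.0"}},
    "scheduler": {"host": {"python": "3.10.2.final.0"}},
    "workers": {
        "tcp://10.0.0.1:40000": {"host": {"python": "3.10.2.final.0"}},
    },
}

if __name__ == "__main__":
    print(python_versions(sample))
    print("mismatch:", has_python_mismatch(sample))
```

With a live cluster, you would pass `client.get_versions()` in place of `sample`. Comparing only major.minor is a design choice here, since pickled code objects generally do not load across different minor versions of Python.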
