Hello everyone!
I am currently trying to use Dask so I can run multiple instances on different machines on demand. I am used to working with Docker containers, so I have launched a scheduler on our Synology NAS and four workers on two servers (two workers per machine, using --nworkers 2). I am then able to connect to the scheduler and see information about the workers. I submit a task, as in the example, with x = client.submit(inc, 10), but when I then call x.result() the worker crashes with the message:
dask-worker | 2022-06-30 15:30:54,139 - distributed.worker - INFO - -------------------------------------------------
dask-worker | 2022-06-30 15:30:54,139 - distributed.core - INFO - Starting established connection
dask-worker | 2022-06-30 15:30:54,141 - distributed.core - INFO - Starting established connection
dask-worker | 2022-06-30 15:33:22,947 - distributed.worker - INFO - Stopping worker at tcp://172.28.0.2:33475
dask-worker | 2022-06-30 15:33:22,954 - distributed.worker - INFO - Connection to scheduler broken. Closing without reporting. ID: Worker-aea8ec1a-3aae-4fb1-a0f4-a3d986cd9590 Address tcp://172.28.0.2:33475 Status: Status.closing
dask-worker | 2022-06-30 15:33:22,956 - distributed.nanny - INFO - Worker closed
dask-worker | 2022-06-30 15:33:22,956 - distributed.nanny - ERROR - Worker process died unexpectedly
dask-worker | 2022-06-30 15:33:23,184 - distributed.nanny - INFO - Closing Nanny at 'tcp://172.28.0.2:41050'.
dask-worker | 2022-06-30 15:33:52,954 - distributed.worker - INFO - Stopping worker at tcp://172.28.0.2:38053
dask-worker | 2022-06-30 15:33:52,960 - distributed.worker - INFO - Connection to scheduler broken. Closing without reporting. ID: Worker-205b67de-2e43-4a8a-82af-4d2dc01e3a4a Address tcp://172.28.0.2:38053 Status: Status.closing
dask-worker | 2022-06-30 15:33:52,962 - distributed.nanny - INFO - Worker closed
dask-worker | 2022-06-30 15:33:52,963 - distributed.nanny - ERROR - Worker process died unexpectedly
dask-worker | 2022-06-30 15:33:53,209 - distributed.nanny - INFO - Closing Nanny at 'tcp://172.28.0.2:33479'.
dask-worker | 2022-06-30 15:33:53,211 - distributed.dask_worker - INFO - End worker
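For reference, the client side looks roughly like the sketch below. The scheduler address is a placeholder (hostname and port are assumptions, not my real setup), and run_example is just a hypothetical wrapper name around the two calls mentioned above.

```python
def inc(x):
    """Toy task from the Dask example: return x + 1."""
    return x + 1


def run_example(scheduler_address):
    """Connect to a running scheduler, submit inc(10), and wait for the result."""
    # Imported here so that inc() can be checked without a cluster available.
    from dask.distributed import Client

    client = Client(scheduler_address)  # connect to the remote scheduler
    print(client)                       # summary of the workers the scheduler knows about
    x = client.submit(inc, 10)          # returns a Future immediately
    return x.result()                   # blocks until a worker returns the value


if __name__ == "__main__":
    # Placeholder address (assumption): replace with the NAS hostname/IP
    # and the port the scheduler actually listens on.
    print(run_example("tcp://nas.example.local:8786"))
```

Connecting and submitting both appear to work; it is the x.result() call at the end that coincides with the worker shutting down.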
I have been following the example from the site, except that my containers are not on the same network. Is having them on the same network essential to solving my problem? And would I need to use Docker Swarm to connect the machines, or is there a workaround?