Consider an SSHCluster hosted within a university. A LAN has been created for it, and by default the LAN has no internet access. I would like to connect remotely (e.g. via TeamViewer) to the main node, which is also used as the client and scheduler. I only need the internet for remote access; otherwise the nodes talk to each other via SSH over the LAN. Each node has a single PCI Ethernet card, which was being used predominantly for the LAN. I tried using the PCI port to connect the client node to the university’s network, and an Ethernet-to-USB adapter to also connect the node to the LAN.
On the university network interface, IPv4 is set to Automatic (DHCP), so it obtains an IP address automatically. On the LAN interface of the client (and of all the other nodes in the cluster), IPv4 is set to manual with static IPs 192.168.1.2X; all nodes use the same subnet mask and were initially assigned a default gateway of 192.168.1.1.
Both network interfaces appear in the Network settings and can be enabled or disabled.
However, by default, when both networks are enabled at the same time, they don’t work. If one is turned off, the other one works.
To get both working concurrently, I had to remove the default gateway from the LAN interface. After that I could browse the internet and also connect from the client to the worker nodes via SSH. However, when I tried Dask, the scheduler, which I always assign as “localhost”, was automatically changed to an IP address that seems to be related to the default gateway of the university network, e.g. tcp://10.6X.16.15X:38XX. It then fails to start and connect to the other workers; their IPs do not seem to change, but they are not recognized regardless. When I reinstate the default gateway on the LAN interface and disable the university network, Dask works as intended once again.
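One way to check which address is being picked is the minimal diagnostic sketch below. It assumes that, when no explicit host is given, the scheduler advertises the local address the OS would use for outbound traffic, i.e. the address of the interface that holds the default route. The function name `outbound_ip` and its parameters are hypothetical, and 8.8.8.8 is only an arbitrary external address used as a routing probe.

```python
import socket


def outbound_ip(probe_host="8.8.8.8", probe_port=80):
    """Return the local IP the OS would use to reach probe_host.

    Assumption: this mirrors how a default address gets chosen when
    no host or interface is specified explicitly.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only asks the
        # kernel to pick a route and therefore a source address.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (e.g. no default gateway): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print("Address on the default route:", outbound_ip())
```

Running this once with only the LAN enabled and once with both networks enabled should show whether the reported address switches from the 192.168.1.x range to the university range.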
I read this link, which seems to suggest the following:
“In many cluster managers the default option is to expose the Dask scheduler and dashboard to the internet via a public IP address.”
However, the example it shows is only relevant to Dask Cloudprovider.
Is this what is happening in my case? Is there an equivalent fix for the Dask SSHCluster that disables this, like the one mentioned for Dask Cloudprovider? In any case, what is the recommended way to achieve what I want (a dual-network configuration)?
I am using Dask Distributed version 2023.8.1 on Ubuntu 22.04.3.
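For concreteness, here is an untested sketch of the kind of SSHCluster equivalent in question. It assumes that `scheduler_options` is forwarded to the Dask scheduler and that the scheduler accepts a `host` keyword to bind to a specific address, in this case the main node's static LAN IP. The helper names `lan_pinned_kwargs` and `start_cluster`, and the host names in the usage comment, are hypothetical.

```python
def lan_pinned_kwargs(worker_hosts, scheduler_lan_ip):
    """Build SSHCluster keyword arguments that bind the scheduler to the LAN.

    Assumption: scheduler_options["host"] makes the scheduler listen on,
    and advertise, the given LAN address instead of the default-route one.
    """
    return {
        # The first entry hosts the scheduler; the rest host workers.
        "hosts": ["localhost", *worker_hosts],
        "scheduler_options": {"host": scheduler_lan_ip},
    }


def start_cluster(worker_hosts, scheduler_lan_ip):
    """Start the cluster; requires dask.distributed and working SSH access."""
    # Imported lazily so the kwargs helper can be inspected without Dask.
    from dask.distributed import Client, SSHCluster

    cluster = SSHCluster(**lan_pinned_kwargs(worker_hosts, scheduler_lan_ip))
    return cluster, Client(cluster)


# Hypothetical usage, with the main node's static LAN IP filled in:
# cluster, client = start_cluster(["node-a", "node-b"], "<main node LAN IP>")
```

Only the scheduler is pinned here because, in this setup, the worker nodes are attached to the LAN alone and so have a single address to choose from.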