Can't connect to local cluster - times out

New to Dask distributed. I’ve created a local cluster (default settings) and confirmed that it’s running by checking the dashboard. But when I try to create a client with client = Client('127.0.0.1:8787') it almost always times out. If I connect with the default client = Client() I get the warning "UserWarning: Port 8787 is already in use." Furthermore, if I try to call the default client from XGBoost

import xgboost as xgb

output = xgb.dask.train(
    client, params, dtrain, num_boost_round=5,
    evals=[(dtrain, 'train')]
)

I get OSError: [Errno 49] Can't assign requested address

Hi @GDB-SF, welcome! 8787 is the default port for the dashboard, but not for the scheduler to talk to clients (that’s 8786). And if either of those is occupied (e.g., if you have another Python session lying around somewhere with another cluster running), then you’ll get that user warning and another port will be chosen.
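If you do want to connect by address, point the client at the scheduler endpoint rather than the dashboard. A minimal sketch, assuming a cluster is already running on the default scheduler port:

from distributed import Client

# connect to the scheduler (port 8786), not the dashboard (port 8787)
client = Client('tcp://127.0.0.1:8786')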

If you are creating local clusters, I’d recommend passing the cluster instance into the client directly; then you don’t need to worry about getting the port right:

import distributed

cluster = distributed.LocalCluster()  # could customize with different kwargs
client = distributed.Client(cluster)

Hi Ian. Wow - thanks for the quick turnaround! My use case is this: I’m weaning my data science team off Pandas and CSVs as we begin working with larger and larger datasets. My intro is - surprise - a Jupyter notebook. The intro uses XGBoost, and I expect them to rerun the notebook as they tinker with the data. So I do this:

First, check whether the cluster already exists: [View cluster status](http://localhost:8787/status). If the cluster does not exist, uncomment the next cell to create one.

Then in the next cell I have

# from distributed import Client, LocalCluster
# cluster = LocalCluster()
# client = Client(cluster)

The best alternative is for them to be able to connect to the running cluster once they’ve created it. Alternatively, I could just add an earlier cell with client.shutdown(), but I suspect it will seem odd to them to shut down a cluster and recreate it every time they want to run the notebook.

Going forward, Ian, what I’m going to do is dig into the documentation and maybe create another notebook where the user can set up the local cluster with sufficient specificity to allow them to call it from other notebooks (see the sketch after this list). So:

  1. Start with your model training notebook. Check whether the local cluster is running
  2. If not, go to the ‘setup’ notebook and create it
  3. Go back to your model training notebook and run your code.
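
For example, the ‘setup’ notebook might pin the scheduler to a known port so the training notebook can find it. A minimal sketch (the fixed port is my choice, matching Dask’s default scheduler port):

# 'setup' notebook: create the cluster on a known port
from distributed import LocalCluster

cluster = LocalCluster(scheduler_port=8786)
print(cluster.scheduler_address)

# model training notebook: attach to the cluster created above
from distributed import Client

client = Client('tcp://localhost:8786')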

Make sense?


Hi @GDB-SF - building on Ian’s example, you could use try/except to check for a running local cluster before creating a new one:

from distributed import Client, LocalCluster

try:
    # attempt to connect to an already-running scheduler
    client = Client('tcp://localhost:8786', timeout='2s')
except OSError:
    # no scheduler found, so start a new local cluster on that port
    cluster = LocalCluster(scheduler_port=8786)
    client = Client(cluster)
client  # display the client repr in the notebook
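
Because the fallback pins the scheduler to port 8786, rerunning the notebook later will reconnect to the same cluster rather than spawning a new one each time.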

One clarifying question - would you ultimately want your data science team to work on a shared cluster, e.g., not just local clusters running on their individual workstations?

@GDB-SF You might also be interested in trying Dask’s JupyterLab extension: this provides an (optional) graphical UI for launching a cluster which can outlive a specific notebook kernel session, and makes it possible to embed Dask’s dashboard panes within the notebook environment.
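
(The extension is published on PyPI as dask-labextension, if you’d like to try it.)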

This video shows some of the interactions that are possible:

So a user can create one cluster with this, and connect to it many times across different notebooks, while maintaining the same dashboard layout.
