I'm running on UChicago and waiting on compute() with an HTCondorCluster set up as follows:
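import dask
from dask.distributed import Client
from dask_jobqueue import HTCondorCluster
cluster = HTCondorCluster(log_directory="path/to/log/", cores=5, memory="20GB", disk="5GB")
output = []
for i in loop:
    # build one delayed task per parameter set
    output.append(dask.delayed(function)(parameters[i]))
cluster.scale(jobs=len(output))  # request one HTCondor job per task
client = Client(cluster)
dask.compute(*output)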
I see a few jobs get submitted, but they appear to fail, and after a few minutes I get this error:
/cvmfs/sft.cern.ch/lcg/releases/LCG_105/distributed/2023.7.1/x86_64-el9-gcc13-opt/lib/python3.9/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 24469 instead
warnings.warn(
ConnectionRefusedError: [Errno 111] Connection refused
What could be causing this, and what steps can I take to diagnose what is happening?
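One low-level check that might help narrow this down is whether a given host and port accept TCP connections at all, since Errno 111 means nothing accepted the connection at the target address. Below is a minimal sketch using only the standard library. `can_connect` is a hypothetical helper, not part of dask, and which host and port to probe (for example the address reported by `cluster.scheduler_address`, tested from a worker node) is an assumption about where the failure occurs:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (Errno 111), timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # 8787 is the default dashboard port named in the warning above;
    # substitute the scheduler host and port to test from a worker node.
    print("127.0.0.1:8787 reachable:", can_connect("127.0.0.1", 8787))
```

If the check succeeds on the submit node but fails from a worker node, that would point toward a network or firewall issue between the two rather than a problem in the task code, though this sketch alone cannot confirm it.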