I am currently deploying a Dask cluster with multiple worker groups using Helm. I deploy as explained here: How to run different worker types with the Dask Helm Chart
I would like to determine which workers correspond to the gpu-worker group.
Is there a simple way to do this? @jacobtomlinson maybe you have some insight here? I can use annotations to determine which workers to run things on, but I would like to explicitly handle the list of tcp://host:port addresses associated with the additional worker group.
Then I could do something like:
```python
client.run(gpu_task, workers=gpu_dask_workers)
```
How can I specify gpu_dask_workers?
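For concreteness, here is a sketch of what I would like to end up with; the addresses below are made up for illustration:

```python
# The goal: an explicit list of the addresses that belong to the gpu-worker
# group, suitable for passing to workers= (these addresses are made up).
gpu_dask_workers = ["tcp://10.0.1.12:40277", "tcp://10.0.1.13:36019"]
```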
OK, this is what I have tried, with no luck with any approach:
```python
from dask.distributed import get_worker

def get_name():
    worker = get_worker()
    return worker.name

# Doesn't work because only the first deployed worker gets the name with Helm
# when I use extraArgs and try to set the name.
names = client.run(get_name)
```
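One detail worth noting in passing: client.run returns a dict keyed by worker address, so the keys at least give the full set of tcp://host:port strings, even though that alone does not say which group each worker belongs to. A minimal sketch:

```python
# client.run returns {worker_address: result}, so the addresses of every
# connected worker are available as the dict keys.
names = client.run(get_name)
all_worker_addresses = list(names)  # e.g. ["tcp://10.0.0.5:43210", ...]
```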
```python
import dask

def get_ip():
    worker = get_worker()
    return str(worker.ip)

# Doesn't work because I am at the mercy of how the task was submitted.
with dask.annotate(resources={'additional_units': 1}):
    r = client.submit(get_ip)

# Doesn't work because run does not take annotations.
with dask.annotate(resources={'additional_units': 1}):
    r = client.run(get_ip)
```
```python
# Hacky, but still at the mercy of Dask scheduling. All tasks get scheduled
# to a single machine.
futures = []
for i in range(100):
    with dask.annotate(resources={'additional_units': 1}):
        futures.append(client.submit(get_ip))
ips = set(client.gather(futures))
```
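One possible reason everything ends up in the same place: client.submit treats functions as pure by default, so 100 identical calls with no arguments all hash to the same key and collapse into a single task. A hedged sketch of forcing distinct tasks (pure=False is a documented keyword of Client.submit; whether it fixes the placement here is an assumption):

```python
futures = []
for i in range(100):
    with dask.annotate(resources={'additional_units': 1}):
        # pure=False gives each call a unique key, so the scheduler treats
        # these as 100 separate tasks rather than one deduplicated task.
        futures.append(client.submit(get_ip, pure=False))
ips = set(client.gather(futures))
```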
Am I missing something @jacobtomlinson? I can’t see how to discover the host:port of the additional workers, either using annotations with names or with resources, and I can’t think of another way to leverage resources.
If you are using annotations you shouldn’t need to specify which worker to send it to. Everything in the context manager should be constrained to the workers with those resources.
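For reference, a minimal sketch of that pattern, assuming the gpu-worker group advertises a custom resource named additional_units (the name taken from the snippets above), that graph optimization keeps the annotations intact, and that gpu_task and data are placeholders:

```python
import dask
from dask import delayed

def gpu_task(x):        # placeholder for the real GPU function
    return x * 2

data = list(range(10))  # placeholder input

# Collections / delayed: the annotation is recorded on the graph built inside
# this context, so these tasks should only run on workers that advertise the
# 'additional_units' resource.
with dask.annotate(resources={'additional_units': 1}):
    tasks = [delayed(gpu_task)(x) for x in data]
results = dask.compute(*tasks)

# Futures API: Client.submit also accepts an explicit resources= keyword,
# which avoids relying on the annotation context.
future = client.submit(gpu_task, data[0], resources={'additional_units': 1})

# If an explicit address list is still wanted, one option (assuming
# client.scheduler_info() reports each worker's resources) is to filter:
gpu_dask_workers = [
    addr
    for addr, info in client.scheduler_info()["workers"].items()
    if "additional_units" in info.get("resources", {})
]
```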