How to pass credentials to workers if they change every hour

I am facing a problem passing AWS credentials to workers.

I am using a complicated mechanism in my client-side code: it authenticates with Kerberos, which then provides me with AWS credentials. When the old AWS credentials expire, I have to re-authenticate with Kerberos to get new ones.

How should I pass AWS credentials from the parent script to the workers?

  1. Pass these credentials to the workers when they are created
  2. Refresh these credentials on already-running workers

I have tried running this code when a worker starts:

import os
from functools import partial

def set_aws_credentials(aws_access_key_id, aws_secret_access_key, aws_session_token):
    # Runs on each worker: export the credentials into the worker's environment
    import os
    os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
    os.environ["AWS_SESSION_TOKEN"] = aws_session_token
    print(f"AWS credentials: {os.getenv('AWS_ACCESS_KEY_ID')}")

aws_access_key_id = os.getenv("AWS_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_SECRET_ACCESS_KEY")
aws_session_token = os.getenv("AWS_SESSION_TOKEN")
# Register a callable (not the result of calling it) so it runs on every worker
self.client.register_worker_callbacks(
    partial(set_aws_credentials, aws_access_key_id, aws_secret_access_key, aws_session_token)
)

I have also tried passing these values to the worker pods as environment variables.

credentials = {
    "AWS_ACCESS_KEY_ID": os.getenv("AWS_ACCESS_KEY_ID"),
    "AWS_SECRET_ACCESS_KEY": os.getenv("AWS_SECRET_ACCESS_KEY"),
    "AWS_SESSION_TOKEN": os.getenv("AWS_SESSION_TOKEN"),
}
# Convert to the Kubernetes env var format: [{"name": ..., "value": ...}, ...]
env_credentials = [{"name": key, "value": value} for key, value in credentials.items()]
worker_spec = make_worker_spec(n_workers=0, env=env_credentials)
worker_spec["spec"]["nodeSelector"] = {"dc_profiler": "true"}
self.cluster.add_worker_group(name=self.cluster.name, custom_spec=worker_spec, env=env_credentials)

When the workers start and I inspect the pod definition with kubectl, I do not see the environment variables:

spec:
  containers:
  - args:
    - dask-worker
    - --name
    - $(DASK_WORKER_NAME)
    - --dashboard
    - --dashboard-address
    - "8788"
    env:
    - name: DASK_WORKER_NAME
      value: tablestats-lpak-default-worker-8b4a38f202
    - name: DASK_SCHEDULER_ADDRESS
      value: tcp://tablestats-lpak-scheduler.dc-dataload-vr-cdl-2.svc.cluster.local:8786
    image: ghcr.io/dask/dask:2024.1.0-py3.11
    imagePullPolicy: IfNotPresent
    name: worker

Can you please tell me what I am doing wrong?

Hi @rvarunrathod,

First: how are you starting your Dask cluster? I’m not sure updating environment variables after the workers have started would be enough…

Did you manage to get it working at least without refreshing them upon expiration?

There is some description of how to do this with EC2Cluster (see the Amazon Web Services (AWS) page of the Dask Cloud Provider documentation), but I’m not sure you are using it.

Perhaps an alternative would be to use a library such as fsspec, which would store the credentials for you. What are you doing with these credentials afterwards?
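
For instance, here is a minimal sketch of the fsspec/s3fs route, assuming the data lives on S3 (the bucket and path below are hypothetical): the temporary credentials are passed explicitly through storage_options, so they travel with the task graph instead of relying on each worker’s environment.

import dask.dataframe as dd

# The credentials are shipped with the tasks via storage_options,
# so the workers do not need them in their environment.
df = dd.read_parquet(
    "s3://my-bucket/my-data/",  # hypothetical bucket and path
    storage_options={
        "key": aws_access_key_id,
        "secret": aws_secret_access_key,
        "token": aws_session_token,
    },
)

When the credentials rotate, rebuilding the collection with fresh storage_options should pick up the new token, since fsspec keys its cached filesystem instances on those parameters.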

cc @jacobtomlinson @martindurant

Updating environment variables will not have any effect on a process that is already running. There should be a way to update a file, though; kube has facilities for that (but don’t ask me the syntax). The typical file for this is ~/.aws/credentials, but I’m not certain that a running session will re-check this file, even when getting a credentials-expired error. You may need to restart your workers every time you update the credentials. If you do this in a rolling fashion, then maybe Dask will be able to copy partial results around so you don’t lose work.
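
A minimal sketch of that idea, assuming a distributed Client is at hand: push the refreshed credentials to all currently running workers by rewriting ~/.aws/credentials on each of them. Whether a running boto3/s3fs session re-reads the file is the open question above, so this may still need to be paired with worker restarts.

import configparser
import os

def write_aws_credentials(key, secret, token):
    # Overwrite the [default] profile in ~/.aws/credentials on this worker
    path = os.path.expanduser("~/.aws/credentials")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    config = configparser.ConfigParser()
    config["default"] = {
        "aws_access_key_id": key,
        "aws_secret_access_key": secret,
        "aws_session_token": token,
    }
    with open(path, "w") as f:
        config.write(f)

# Run on all currently connected workers each time the credentials are refreshed
client.run(write_aws_credentials, aws_access_key_id, aws_secret_access_key, aws_session_token)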
