I am facing a problem passing AWS credentials to workers.
My client-side code uses a fairly complicated mechanism: it authenticates with Kerberos, which then provides the AWS credentials. When the AWS credentials expire, I have to re-authenticate with Kerberos to get new ones.
How should I pass AWS credentials from the parent script to the workers?
- I have to pass these credentials to workers when they are created
- I have to refresh these credentials on workers that are already running
I have tried running this code when a worker starts:
def set_aws_credentials(aws_access_key_id, aws_secret_access_key, aws_session_token):
    import os
    os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
    os.environ["AWS_SESSION_TOKEN"] = aws_session_token
    print(f"AWS credentials :{os.getenv('AWS_ACCESS_KEY_ID')}")
aws_access_key_id = os.getenv("AWS_ACCESS_KEY_ID")
aws_secret_access_key = os.getenv("AWS_SECRET_ACCESS_KEY")
aws_session_token = os.getenv("AWS_SESSION_TOKEN")
self.client.register_worker_callbacks(set_aws_credentials(aws_access_key_id, aws_secret_access_key, aws_session_token))
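One detail worth checking in the snippet above: `set_aws_credentials(...)` is called on the client side, so its return value (`None`) is what gets passed to `register_worker_callbacks`, not the function itself. A minimal sketch of passing a callable instead, assuming `client` is a `dask.distributed.Client`; the helper names `make_setup`, `push_credentials`, and `refresh_credentials` are hypothetical and only for illustration:

```python
import os
from functools import partial


def set_aws_credentials(aws_access_key_id, aws_secret_access_key, aws_session_token):
    """Export the credentials into the environment of the current process."""
    os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
    os.environ["AWS_SESSION_TOKEN"] = aws_session_token


def make_setup(key, secret, token):
    """Hypothetical helper: bind the values without calling the function yet."""
    return partial(set_aws_credentials, key, secret, token)


def push_credentials(client, key, secret, token):
    """Hypothetical helper: register a setup callable for current and future workers.

    `client` is assumed to be a dask.distributed.Client.
    """
    setup = make_setup(key, secret, token)
    # The callable itself is passed; the workers are the ones that call it.
    client.register_worker_callbacks(setup=setup)
    return setup


def refresh_credentials(client, key, secret, token):
    """Hypothetical helper: overwrite the variables on workers connected right now."""
    # Client.run executes the function once on every currently connected worker.
    return client.run(set_aws_credentials, key, secret, token)
```

Under these assumptions, `push_credentials` would be called once after the first Kerberos authentication, and `refresh_credentials` each time new credentials are issued. Libraries that cached the old values at import or client-creation time may not pick up the changed environment variables, so this is a starting point rather than a complete solution.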
I have also tried passing these values as environment variables.
credentials = {
    "AWS_ACCESS_KEY_ID": os.getenv("AWS_ACCESS_KEY_ID"),
    "AWS_SECRET_ACCESS_KEY": os.getenv("AWS_SECRET_ACCESS_KEY"),
    "AWS_SESSION_TOKEN": os.getenv("AWS_SESSION_TOKEN"),
}
env_credentials = [{"name": key, "value": value} for key, value in credentials.items()]
worker_spec = make_worker_spec(n_workers=0, env=env_credentials)
worker_spec["spec"]["nodeSelector"] = {"dc_profiler": "true"}
self.cluster.add_worker_group(name=self.cluster.name, custom_spec=worker_spec, env=env_credentials)
When the workers start and I inspect the pod definition with kubectl, I do not see the AWS environment variables:
spec:
  containers:
  - args:
    - dask-worker
    - --name
    - $(DASK_WORKER_NAME)
    - --dashboard
    - --dashboard-address
    - "8788"
    env:
    - name: DASK_WORKER_NAME
      value: tablestats-lpak-default-worker-8b4a38f202
    - name: DASK_SCHEDULER_ADDRESS
      value: tcp://tablestats-lpak-scheduler.dc-dataload-vr-cdl-2.svc.cluster.local:8786
    image: ghcr.io/dask/dask:2024.1.0-py3.11
    imagePullPolicy: IfNotPresent
    name: worker
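To make the check above repeatable, a small sketch that reports which expected variables are absent from a container's `env` list. It assumes the pod definition has been loaded into a Python dict (for example from `kubectl get pod <pod-name> -o json`); the function names are hypothetical, and the sample dict only mirrors the relevant fields of the pod shown above:

```python
def container_env_names(pod, container_name="worker"):
    """Return the env variable names set on one container of a pod definition."""
    for container in pod["spec"]["containers"]:
        if container.get("name") == container_name:
            # "env" may be absent or null when no variables are set.
            return [entry["name"] for entry in (container.get("env") or [])]
    return []


def missing_env(pod, expected, container_name="worker"):
    """Return the expected variable names that the container does not define."""
    present = set(container_env_names(pod, container_name))
    return [name for name in expected if name not in present]


# Sample data mirroring the relevant fields of the pod definition above.
sample_pod = {
    "spec": {
        "containers": [
            {
                "name": "worker",
                "image": "ghcr.io/dask/dask:2024.1.0-py3.11",
                "env": [
                    {"name": "DASK_WORKER_NAME",
                     "value": "tablestats-lpak-default-worker-8b4a38f202"},
                    {"name": "DASK_SCHEDULER_ADDRESS",
                     "value": "tcp://tablestats-lpak-scheduler.dc-dataload-vr-cdl-2.svc.cluster.local:8786"},
                ],
            }
        ]
    }
}

expected_names = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"]
missing = missing_env(sample_pod, expected_names)
```

Running the same check against a pod from each worker group separately would show whether the variables are missing everywhere or only on some pods.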
Can you please help me understand what I am doing wrong?