I am trying to deploy Dask with namespace-level permissions only (no permissions on the whole cluster, meaning a ClusterRole would have to be treated as a Role). However, Gateway needs cluster-wide permissions to work with DaskCluster resources. Is there a way to configure Gateway and the DaskCluster resources so that they do not require cluster-wide permissions?
This is the output of Gateway pod logs:
[E 2023-03-01 13:40:58.197 DaskGateway] Error in cluster informer, retrying...
Traceback (most recent call last):
  File "/home/dask/.local/lib/python3.11/site-packages/dask_gateway_server/backends/kubernetes/utils.py", line 149, in run
    initial = await method(**self.method_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/api_client.py", line 192, in __call_api
    raise e
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
    response_data = await self.request(
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
    return (await self.request("GET", url,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/rest.py", line 187, in request
    raise ApiException(http_resp=r)
kubernetes_asyncio.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: <CIMultiDictProxy('Audit-Id': 'cfdd8eea-2a1c-4352-89a7-ae575db88302', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '9be32bf1-fb16-42ed-ba4b-08e870d56884', 'X-Kubernetes-Pf-Prioritylevel-Uid': '65342b8c-d58f-47e4-b499-73e62bb393da', 'Date': 'Wed, 01 Mar 2023 13:40:58 GMT', 'Content-Length': '451')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"daskclusters.gateway.dask.org is forbidden: User \"system:serviceaccount:visnovsky1-ns:api-dhub-dask-gateway\" cannot list resource \"daskclusters\" in API group \"gateway.dask.org\" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io \"fleet-content\" not found","reason":"Forbidden","details":{"group":"gateway.dask.org","kind":"daskclusters"},"code":403}
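The key phrase in the response body is "at the cluster scope": the request that was denied is a list across all namespaces, and a namespaced Role cannot grant that. As a rough illustration of the difference, the sketch below builds the two REST paths that Kubernetes uses for listing a custom resource. The helper name `list_path` is hypothetical, and the API version `v1alpha1` is an assumption (it does not appear in the log); the group, resource, and namespace are taken from the error message.

```python
def list_path(group, version, plural, namespace=None):
    """Build the Kubernetes REST path for listing a custom resource.

    With no namespace the path addresses the resource at cluster scope;
    with a namespace it addresses only the objects in that namespace.
    """
    base = f"/apis/{group}/{version}"
    if namespace is None:
        return f"{base}/{plural}"
    return f"{base}/namespaces/{namespace}/{plural}"


GROUP = "gateway.dask.org"   # from the error message
VERSION = "v1alpha1"         # assumption: not shown in the log
PLURAL = "daskclusters"      # from the error message

# Cluster-scope list: needs a ClusterRole bound with a ClusterRoleBinding.
cluster_scope_path = list_path(GROUP, VERSION, PLURAL)

# Namespaced list: a Role and RoleBinding in that namespace are enough.
namespaced_path = list_path(GROUP, VERSION, PLURAL, namespace="visnovsky1-ns")

print(cluster_scope_path)
print(namespaced_path)
```

The log suggests the gateway's informer issues the first kind of request, which would explain why swapping the RBAC objects alone does not remove the 403.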
Could you please provide the configuration you use to deploy dask-gateway (values.yaml and the command)? Did you use the dask-gateway Helm chart directly, or the daskhub chart?
Also pinging @jacobtomlinson as this looks like it needs deep Kubernetes knowledge.
dask-gateway:
  enabled: true
  gateway:
    auth:
      jupyterhub:
        apiToken: b2ff1f8f489ad89469d6b92272441a4a43a8713110e766da28ad74e9c84d8fb3 # replace this
        apiUrl: http://proxy-public/hub/api
      type: jupyterhub
    livenessProbe:
      # Enables the livenessProbe.
      enabled: true
      # Configures the livenessProbe.
      initialDelaySeconds: 5
      timeoutSeconds: 2
      periodSeconds: 10
      failureThreshold: 6
    readinessProbe:
      # Enables the readinessProbe.
      enabled: true
      # Configures the readinessProbe.
      initialDelaySeconds: 5
      timeoutSeconds: 2
      periodSeconds: 10
      failureThreshold: 3
    extraConfig:
      dasklimits: |
        c.ClusterConfig.cluster_max_cores = 6
        c.ClusterConfig.cluster_max_memory = "24 G"
        c.ClusterConfig.cluster_max_workers = 4
        c.ClusterConfig.idle_timeout = 1800
      optionHandler: |
        from dask_gateway_server.options import Options, Integer, Float, String
        def options_handler(options):
            if ":" not in options.image:
                raise ValueError("When specifying an image you must also provide a tag")
            return {
                "worker_cores": options.worker_cores,
                "worker_memory": int(options.worker_memory * 2 ** 30),
                "image": options.image,
            }
        c.Backend.cluster_options = Options(
            Integer("worker_cores", default=1, min=1, max=4, label="Worker Cores"),
            Float("worker_memory", default=2, min=2, max=8, label="Worker Memory (GiB)"),
            String("image", default="pangeo/pangeo-notebook:2022.09.21", label="Image"),
            handler=options_handler,
        )
    prefix: /services/dask-gateway
    backend:
      scheduler:
        extraContainerConfig:
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
      worker:
        extraContainerConfig:
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
  traefik:
    service:
      type: ClusterIP
dask-kubernetes:
  enabled: false
rbac:
  enabled: true
jupyterhub:
  hub:
    config:
      Authenticator:
        admin_users:
          - admin
      JupyterHub:
        admin_access: true
        authenticator_class: nativeauthenticator.NativeAuthenticator
    services:
      dask-gateway:
        apiToken: b2ff1f8f489ad89469d6b92272441a4a43a8713110e766da28ad74e9c84d8fb3 # replace this
    networkPolicy:
      interNamespaceAccessLabels: "accept"
  ingress:
    annotations:
      kubernetes.io/ingress.class: nginx
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    enabled: true
    hosts:
      - daskgateway.dyn.cloud.e-infra.cz # replace this with your DNS name
    tls:
      - hosts:
          - daskgateway.dyn.cloud.e-infra.cz # replace this with your DNS name
        secretName: daskgateway-dyn-cloud-e-infra-cz # replace this with your DNS name
  proxy:
    secretToken: 04bc4720901afa229c7c6832592c2d53bd17e45035fe5bbc23693c785b0c3913 # replace this
    service:
      type: ClusterIP
  singleuser:
    networkPolicy:
      enabled: false
    cloudMetadata:
      blockWithIptables: false
    cpu:
      guarantee: 1
      limit: 2
    defaultUrl: /lab
    extraEnv:
      DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE: '{JUPYTER_IMAGE_SPEC}'
    image:
      name: pangeo/pangeo-notebook
      tag: 2022.09.21
    memory:
      guarantee: 2G
      limit: 4G
    startTimeout: 600
    storage:
      capacity: 2Gi
    extraPodConfig:
      securityContext:
        fsGroupChangePolicy: OnRootMismatch
  scheduling:
    userScheduler:
      enabled: false
  prePuller:
    hook:
      enabled: false
    continuous:
      enabled: false
As for how I deploy Dask: I am using a locally downloaded copy of the Helm chart (helm pull daskhub --repo=https://helm.dask.org), in which I replaced every “ClusterRole” with “Role” and every “ClusterRoleBinding” with “RoleBinding” in the rbac.yaml files.
gateway/rbac.yaml (the same for controller/rbac.yaml and traefik/rbac.yaml):
{{- if .Values.rbac.enabled -}}
{{- if not .Values.rbac.gateway.serviceAccountName -}}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ include "dask-gateway.apiName" . }}
  labels:
    {{- include "dask-gateway.labels" . | nindent 4 }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role #ClusterRole
metadata:
  name: {{ include "dask-gateway.apiName" . }}
  labels:
    {{- include "dask-gateway.labels" . | nindent 4 }}
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
  - apiGroups: ["gateway.dask.org"]
    resources: ["daskclusters"]
    verbs: ["*"]
---
kind: RoleBinding #ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: {{ include "dask-gateway.apiName" . }}
  labels:
    {{- include "dask-gateway.labels" . | nindent 4 }}
subjects:
  - kind: ServiceAccount
    name: {{ include "dask-gateway.apiName" . }}
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: Role #ClusterRole
  name: {{ include "dask-gateway.apiName" . }}
  apiGroup: rbac.authorization.k8s.io
{{- end }}
{{- end }}
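The manual substitution described above (every "ClusterRole" to "Role", every "ClusterRoleBinding" to "RoleBinding") could be scripted instead of edited by hand. The following is a minimal sketch using only the Python standard library; the function name `namespace_rbac_kinds` and the sample manifest are hypothetical, and it only touches `kind:` lines, including the one under `roleRef`.

```python
import re

# Cluster-scoped RBAC kinds and their namespaced equivalents.
NAMESPACED_KIND = {
    "ClusterRole": "Role",
    "ClusterRoleBinding": "RoleBinding",
}

# Match "kind: ClusterRole" / "kind: ClusterRoleBinding" at the start of a
# line (after optional indentation). The longer name is listed first so it
# is tried before its prefix.
KIND_LINE = re.compile(
    r"^([ \t]*kind:[ \t]*)(ClusterRoleBinding|ClusterRole)\b",
    re.MULTILINE,
)


def namespace_rbac_kinds(manifest: str) -> str:
    """Return the manifest text with cluster-scoped RBAC kinds made namespaced."""
    return KIND_LINE.sub(
        lambda m: m.group(1) + NAMESPACED_KIND[m.group(2)],
        manifest,
    )


# Hypothetical, simplified manifest used only to exercise the function.
sample = """\
kind: ClusterRole
metadata:
  name: example
---
kind: ClusterRoleBinding
roleRef:
  kind: ClusterRole
  name: example
"""

converted = namespace_rbac_kinds(sample)
print(converted)
```

Note that this only changes which RBAC objects get created. It does not change the requests the gateway itself makes, so a list issued at cluster scope, as in the log above, would still be denied.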
I’m not 100% sure, but I thought Dask Gateway could only be installed at the cluster level with a ClusterRole.
If you’re just trying to run Dask clusters on Kubernetes in your own namespace and you have permission to use kubectl, I recommend you look at dask-kubernetes.
Are you trying to deploy Dask as a service for your team or just yourself?
If all of the users have access to the Kubernetes API, then dask-kubernetes is probably your best choice anyway. If you’re trying to abstract Kubernetes away entirely, then Gateway is your best choice. So it depends on what you’re trying to do.
@VladimirVisnovsky did you ever get this to work? I’m in the same situation: I want to deploy in a single namespace and am only allowed to use Roles/RoleBindings, not ClusterRoles/ClusterRoleBindings.