Deploying Dask without cluster permissions (only with namespace permissions)

I am trying to deploy Dask with only namespace-level permissions (no permissions on the whole cluster, so every ClusterRole has to be replaced by a Role). However, Gateway needs cluster-wide permissions to work with DaskCluster resources. Is there a way to configure Gateway and the DaskCluster resources so that they do not require cluster permissions?

This is the output of Gateway pod logs:

[E 2023-03-01 13:40:58.197 DaskGateway] Error in cluster informer, retrying...
Traceback (most recent call last):
  File "/home/dask/.local/lib/python3.11/site-packages/dask_gateway_server/backends/kubernetes/utils.py", line 149, in run
    initial = await method(**self.method_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/api_client.py", line 192, in __call_api
    raise e
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/api_client.py", line 185, in __call_api
    response_data = await self.request(
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/rest.py", line 193, in GET
    return (await self.request("GET", url,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dask/.local/lib/python3.11/site-packages/kubernetes_asyncio/client/rest.py", line 187, in request
    raise ApiException(http_resp=r)
kubernetes_asyncio.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: <CIMultiDictProxy('Audit-Id': 'cfdd8eea-2a1c-4352-89a7-ae575db88302', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '9be32bf1-fb16-42ed-ba4b-08e870d56884', 'X-Kubernetes-Pf-Prioritylevel-Uid': '65342b8c-d58f-47e4-b499-73e62bb393da', 'Date': 'Wed, 01 Mar 2023 13:40:58 GMT', 'Content-Length': '451')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"daskclusters.gateway.dask.org is forbidden: User \"system:serviceaccount:visnovsky1-ns:api-dhub-dask-gateway\" cannot list resource \"daskclusters\" in API group \"gateway.dask.org\" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io \"fleet-content\" not found","reason":"Forbidden","details":{"group":"gateway.dask.org","kind":"daskclusters"},"code":403}
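
For readers skimming the traceback, the useful part is the JSON body on the last line. The sketch below copies that body verbatim from the log and pulls out who was denied, which verb and resource were involved, and whether the request was made at the cluster scope. The helper name `summarize_denial` is hypothetical, and the regular expression only assumes the message wording seen in this log.

```python
import json
import re

# Response body copied from the gateway log above (a Kubernetes Status object).
BODY = (
    r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    r'"message":"daskclusters.gateway.dask.org is forbidden: User '
    r'\"system:serviceaccount:visnovsky1-ns:api-dhub-dask-gateway\" cannot list '
    r'resource \"daskclusters\" in API group \"gateway.dask.org\" at the cluster '
    r'scope: RBAC: clusterrole.rbac.authorization.k8s.io \"fleet-content\" not found",'
    r'"reason":"Forbidden","details":{"group":"gateway.dask.org","kind":"daskclusters"},'
    r'"code":403}'
)


def summarize_denial(body: str) -> dict:
    """Extract the key facts from a 'Forbidden' Status body (hypothetical helper)."""
    status = json.loads(body)
    message = status["message"]
    # Matches the wording used in this particular log message.
    match = re.search(r'User "([^"]+)" cannot (\w+) resource "([^"]+)"', message)
    user, verb, resource = match.groups() if match else (None, None, None)
    return {
        "code": status["code"],
        "user": user,
        "verb": verb,
        "resource": resource,
        "group": status["details"]["group"],
        # True when the request was not limited to a single namespace.
        "cluster_scope": "at the cluster scope" in message,
    }


summary = summarize_denial(BODY)
print(summary)
```

The `cluster_scope` flag is the relevant detail here: the service account is asking to list `daskclusters` across all namespaces, not just inside `visnovsky1-ns`.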

Hi @VladimirVisnovsky, welcome to the forum!

Could you please share the configuration you use to deploy dask-gateway (values.yaml and the command)? Did you use the dask-gateway Helm chart directly, or the daskhub chart?

Also pinging @jacobtomlinson as this looks like it needs deep Kubernetes knowledge.

Thank you!

  1. Configuration I used for dask-gateway:
dask-gateway:
  enabled: true
  gateway:
    auth:
      jupyterhub:
        apiToken: b2ff1f8f489ad89469d6b92272441a4a43a8713110e766da28ad74e9c84d8fb3 # replace this 
        apiUrl: http://proxy-public/hub/api
      type: jupyterhub
    livenessProbe:
      # Enables the livenessProbe.
      enabled: true
      # Configures the livenessProbe.
      initialDelaySeconds: 5
      timeoutSeconds: 2
      periodSeconds: 10
      failureThreshold: 6
    readinessProbe:
      # Enables the readinessProbe.
      enabled: true
      # Configures the readinessProbe.
      initialDelaySeconds: 5
      timeoutSeconds: 2
      periodSeconds: 10
      failureThreshold: 3
    extraConfig:
      dasklimits: |
        c.ClusterConfig.cluster_max_cores = 6
        c.ClusterConfig.cluster_max_memory = "24 G"
        c.ClusterConfig.cluster_max_workers = 4
        c.ClusterConfig.idle_timeout = 1800
      optionHandler: |
        from dask_gateway_server.options import Options, Integer, Float, String

        def options_handler(options):
          if ":" not in options.image:
            raise ValueError("When specifying an image you must also provide a tag")
          return {
            "worker_cores": options.worker_cores,
            "worker_memory": int(options.worker_memory * 2 ** 30),
            "image": options.image,
          }

        c.Backend.cluster_options = Options(
          Integer("worker_cores", default=1, min=1, max=4, label="Worker Cores"),
          Float("worker_memory", default=2, min=2, max=8, label="Worker Memory (GiB)"),
          String("image", default="pangeo/pangeo-notebook:2022.09.21", label="Image"),
          handler=options_handler,
        )
    prefix: /services/dask-gateway
    backend:
      scheduler:
        extraContainerConfig:
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
      worker:
        extraContainerConfig:
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
  traefik:
    service:
      type: ClusterIP

dask-kubernetes:
  enabled: false

rbac:
  enabled: true

jupyterhub:
  hub:
    config:
      Authenticator:
        admin_users:
        - admin
      JupyterHub:
        admin_access: true
        authenticator_class: nativeauthenticator.NativeAuthenticator          
    services:
      dask-gateway:
        apiToken: b2ff1f8f489ad89469d6b92272441a4a43a8713110e766da28ad74e9c84d8fb3 # replace this
    networkPolicy:
      interNamespaceAccessLabels: "accept"
  ingress:
    annotations:
      kubernetes.io/ingress.class: nginx
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    enabled: true
    hosts:
      - daskgateway.dyn.cloud.e-infra.cz # replace this with your DNS name
    tls:
      - hosts:
        - daskgateway.dyn.cloud.e-infra.cz # replace this with your DNS name
        secretName: daskgateway-dyn-cloud-e-infra-cz # replace this with your DNS name
  proxy:
    secretToken: 04bc4720901afa229c7c6832592c2d53bd17e45035fe5bbc23693c785b0c3913 # replace this 
    service:
      type: ClusterIP
  singleuser:
    networkPolicy:
      enabled: false
    cloudMetadata:
      blockWithIptables: false
    cpu:
      guarantee: 1
      limit: 2
    defaultUrl: /lab
    extraEnv:
      DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE: '{JUPYTER_IMAGE_SPEC}'
    image:
      name: pangeo/pangeo-notebook
      tag: 2022.09.21
    memory:
      guarantee: 2G
      limit: 4G
    startTimeout: 600
    storage:
      capacity: 2Gi
    extraPodConfig:
      securityContext:
        fsGroupChangePolicy: OnRootMismatch
  scheduling:
    userScheduler:
      enabled: false
  prePuller:
    hook:
      enabled: false
    continuous:
      enabled: false  
  2. Command I use to deploy Dask:
    I am using a downloaded copy of the Helm chart (helm pull daskhub --repo=https://helm.dask.org) in which I changed every “ClusterRole” to “Role” and every “ClusterRoleBinding” to “RoleBinding” in the rbac.yaml files.

gateway/rbac.yaml (the same for controller/rbac.yaml and traefik/rbac.yaml):

{{- if .Values.rbac.enabled -}}
{{- if not .Values.rbac.gateway.serviceAccountName -}}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ include "dask-gateway.apiName" . }}
  labels:
    {{- include "dask-gateway.labels" . | nindent 4 }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role #ClusterRole
metadata:
  name: {{ include "dask-gateway.apiName" . }}
  labels:
    {{- include "dask-gateway.labels" . | nindent 4 }}
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
  - apiGroups: ["gateway.dask.org"]
    resources: ["daskclusters"]
    verbs: ["*"]
---
kind: RoleBinding #ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: {{ include "dask-gateway.apiName" . }}
  labels:
    {{- include "dask-gateway.labels" . | nindent 4 }}
subjects:
  - kind: ServiceAccount
    name: {{ include "dask-gateway.apiName" . }}
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: Role #ClusterRole
  name: {{ include "dask-gateway.apiName" . }}
  apiGroup: rbac.authorization.k8s.io
{{- end }}
{{- end }}

Command:

helm upgrade dhub /Path/to/daskhub/repo --install --render-subchart-notes  --wait --values=daskhub.yaml

Please, let me know if you need the whole repo.

I’m clearly not a Kubernetes expert, so this is probably a silly question, but is it normal that you don’t specify your namespace in the helm command?

It is not necessary to specify the namespace explicitly if one has configured ~/.kube/config with the desired namespace (which I have).

I’m not 100% sure but I thought Dask Gateway could only be installed at the cluster level with a cluster role.
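
To illustrate why swapping ClusterRole for Role still ends in the 403 from the log, here is a deliberately simplified toy model of the scope check. It is not the real Kubernetes RBAC API: the `Grant` class and `allowed` function are hypothetical, and the namespace `visnovsky1-ns` is taken from the service account name in the error message. The only behaviour it models is that a permission bound through a RoleBinding applies inside that binding's namespace, while a request made across all namespaces needs a cluster-wide binding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    """One rule reachable through a binding (toy model, not the real RBAC API)."""
    api_group: str
    resource: str
    verbs: frozenset
    # Namespace of the RoleBinding; None models a ClusterRoleBinding.
    namespace: Optional[str]


def allowed(grants, verb, api_group, resource, namespace):
    """Return True if any grant covers the request.

    Passing namespace=None models a cluster-scoped call such as
    "list daskclusters across all namespaces".
    """
    for g in grants:
        if g.api_group != api_group or g.resource != resource:
            continue
        if "*" not in g.verbs and verb not in g.verbs:
            continue
        # A cluster-wide grant covers every request; a namespaced grant only
        # covers requests made inside that same namespace.
        if g.namespace is None or g.namespace == namespace:
            return True
    return False


# The edited rbac.yaml: verbs ["*"] on daskclusters, bound with a RoleBinding.
role_grant = [
    Grant("gateway.dask.org", "daskclusters", frozenset({"*"}), "visnovsky1-ns")
]

in_namespace = allowed(
    role_grant, "list", "gateway.dask.org", "daskclusters", "visnovsky1-ns"
)
cluster_wide = allowed(role_grant, "list", "gateway.dask.org", "daskclusters", None)
print(in_namespace)  # True
print(cluster_wide)  # False
```

Under this model the Role is enough for a namespaced list, but the Gateway's informer issues the cluster-scoped list, which is the request the log shows being rejected.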

If you’re just trying to run Dask clusters on Kubernetes in your own namespace and you have permissions to use kubectl I recommend you look at dask-kubernetes.

Are you trying to deploy Dask as a service for your team or just yourself?

I’m trying to deploy Dask as a service in a multi-user environment.

Do you think it is possible to somehow “bypass” the Gateway (or imitate its functionality) by using dask-kubernetes?

If all of the users have access to the Kubernetes API, then dask-kubernetes is probably your best choice anyway. If you’re trying to abstract Kubernetes away entirely, then Gateway is your best choice. So it depends on what you’re trying to do.

Just wanted to check back in here. The latest release of dask-kubernetes now supports installing in a single namespace.

https://kubernetes.dask.org/en/latest/operator_installation.html#single-namespace
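
For anyone landing here later, a minimal sketch of what using dask-kubernetes from Python might look like once the operator is installed in your namespace. It assumes the `dask-kubernetes` and `distributed` packages are available, that the operator has already been set up as described in the link above, and that your kubeconfig grants access to the namespace. The function name and the cluster name `example-cluster` are hypothetical.

```python
def start_namespaced_cluster(namespace: str, n_workers: int = 2):
    """Create a Dask cluster in a single namespace (sketch, see assumptions above)."""
    # Imported lazily so this file can be loaded without the packages installed.
    from dask.distributed import Client
    from dask_kubernetes.operator import KubeCluster

    cluster = KubeCluster(
        name="example-cluster",  # hypothetical name
        namespace=namespace,     # the namespace you have permissions in
        n_workers=n_workers,
    )
    client = Client(cluster)
    return cluster, client
```

Calling `start_namespaced_cluster("visnovsky1-ns")` would then return a cluster object and a connected client; both should be closed when the work is done.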