Gateway server to mount per-user volumes onto dask worker pods

Hello, we have a JupyterHub + dask-gateway deployment on k8s based on the public Helm chart “daskhub”. It sets up a dask-gateway server instance as a JupyterHub service. Users authenticate against the gateway server using their per-user JupyterHub API token and spin up new Dask clusters on demand.
We now want to configure the gateway server to dynamically mount per-user volumes onto the dask worker pods.

For a quick test, I was able to mount a SHARED volume for all users and worker pods by adding “c.KubeClusterConfig.worker_extra_container_config” and “c.KubeClusterConfig.worker_extra_pod_config” to the “gateway.extraConfig” section:

gateway:
  extraConfig:
    clusteroptions: |
      # Mount the shared PVC into every worker container at /mnt.
      c.KubeClusterConfig.worker_extra_container_config = {
          "volumeMounts": [
              {"mountPath": "/mnt", "name": "volume-shared"}
          ]
      }
      # Add the corresponding volume, backed by the existing PVC, to the worker pod spec.
      c.KubeClusterConfig.worker_extra_pod_config = {
          "volumes": [
              {"name": "volume-shared", "persistentVolumeClaim": {"claimName": "claim-SHARED"}}
          ]
      }
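
For reference, this assumes a PVC named claim-SHARED already exists in the namespace the worker pods run in. A minimal sketch of such a claim (size and access mode are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-SHARED
spec:
  accessModes:
    - ReadWriteMany   # so worker pods on different nodes can all mount it
  resources:
    requests:
      storage: 10Gi   # placeholder size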

But how do we mount per-user volumes? For example, UserX has an existing volume VolX, and we want to mount VolX onto the dask worker pods created by UserX.

Thanks

Just an update: we were able to dynamically mount per-user volumes onto dask worker pods by modifying this example:
https://gateway.dask.org/cluster-options.html#user-specific-configuration

gateway:
  extraConfig:
    clusteroptions: |
      from dask_gateway_server.options import Options

      def options_handler(options, user):
          # Derive the PVC name from the JupyterHub username,
          # e.g. user "alice" -> PVC "claim-alice".
          hub_username = user.name
          per_user_claim = "claim-{}".format(hub_username)
          return {
              "worker_extra_container_config": {
                  "volumeMounts": [
                      {"mountPath": "/home/jovyan", "name": "per-user-storage"}
                  ]
              },
              "worker_extra_pod_config": {
                  "volumes": [
                      {"name": "per-user-storage", "persistentVolumeClaim": {"claimName": per_user_claim}}
                  ]
              },
          }

      c.Backend.cluster_options = Options(handler=options_handler)
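
From the client side nothing changes. A minimal sketch, assuming the default daskhub client setup (gateway address and JupyterHub auth are picked up from the user’s notebook environment):

from dask_gateway import Gateway

# Address and credentials come from the environment configured by daskhub.
gateway = Gateway()

# The server-side options_handler runs on cluster creation and injects
# this user's PVC into every worker pod spec.
cluster = gateway.new_cluster()
client = cluster.get_client()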


But I notice that when mounting the per-user volume, worker pod spin-up time is much longer than when mounting just the shared volume (hardcoded in the config). I don’t understand why; both the shared and per-user volumes already exist before the mount.

> But how do we mount per-user volumes? For example, UserX has an existing volume VolX, and we want to mount VolX onto the dask worker pods created by UserX.

@drew Could you please share some more details about why you’re looking to do this?

I’m not very familiar with KubeCluster, but I think @jacobtomlinson might have helpful ideas. 🙂

The pod startup time will be related to how your cluster provisions volumes, which you haven’t mentioned. It is likely that having both volumes assigned to the pod makes placement harder, and things may need to be changed on the node to accommodate them.
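
For example, if the per-user PVCs use a storage class with volumeBindingMode: WaitForFirstConsumer, volume binding and attach are deferred until a pod using the claim is actually scheduled, which shows up as extra worker spin-up time. A hypothetical storage class illustrating this (the name and provisioner are placeholders):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: per-user-ssd                       # hypothetical name
provisioner: kubernetes.io/aws-ebs         # example provisioner; yours may differ
volumeBindingMode: WaitForFirstConsumer    # bind/attach happens at pod scheduling time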