Gateway server to mount per-user volumes onto dask worker pods

Hello, we have a JupyterHub + dask-gateway deployment on k8s based on the public Helm chart “daskhub”. It sets up a dask-gateway server instance as a JupyterHub service. Users authenticate against the gateway server using their per-user JupyterHub API token and spin up new Dask clusters on demand.
We now want to configure the gateway server to dynamically mount per-user volumes onto the dask worker pods.

For a quick test, I was able to mount a SHARED volume for all users and worker pods by adding “c.KubeClusterConfig.worker_extra_container_config” and “c.KubeClusterConfig.worker_extra_pod_config” to the “gateway.extraConfig” section:

gateway:
  extraConfig:
    clusteroptions: |
      # Mount the shared PVC into every worker container at /mnt.
      c.KubeClusterConfig.worker_extra_container_config = {
          "volumeMounts": [
              {"mountPath": "/mnt", "name": "volume-shared"}
          ]
      }
      # Add the corresponding volume, backed by the existing PVC, to the worker pod spec.
      c.KubeClusterConfig.worker_extra_pod_config = {
          "volumes": [
              {"name": "volume-shared", "persistentVolumeClaim": {"claimName": "claim-SHARED"}}
          ]
      }
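
For reference, this assumes a PVC named claim-SHARED already exists in the namespace the worker pods run in. A minimal sketch of such a claim (size and access mode are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-SHARED
spec:
  accessModes:
    - ReadWriteMany   # so worker pods on different nodes can all mount it
  resources:
    requests:
      storage: 10Gi   # placeholder size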

But how do we mount per-user volumes? For example, UserX has an existing volume VolX, and we want to mount VolX onto the dask worker pods created by UserX.

Thanks

Just an update: we were able to dynamically mount per-user volumes onto dask worker pods by modifying this example:
https://gateway.dask.org/cluster-options.html#user-specific-configuration

gateway:
  extraConfig:
    clusteroptions: |
      from dask_gateway_server.options import Options

      def options_handler(options, user):
          # Derive the PVC name from the JupyterHub username,
          # e.g. user "alice" -> PVC "claim-alice".
          hub_username = user.name
          per_user_claim = "claim-{}".format(hub_username)
          return {
              "worker_extra_container_config": {
                  "volumeMounts": [
                      {"mountPath": "/home/jovyan", "name": "per-user-storage"}
                  ]
              },
              "worker_extra_pod_config": {
                  "volumes": [
                      {"name": "per-user-storage", "persistentVolumeClaim": {"claimName": per_user_claim}}
                  ]
              },
          }

      c.Backend.cluster_options = Options(handler=options_handler)
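
From the client side nothing changes. A minimal sketch, assuming the default daskhub client setup (gateway address and JupyterHub auth are picked up from the user’s notebook environment):

from dask_gateway import Gateway

# Address and credentials come from the environment configured by daskhub.
gateway = Gateway()

# The server-side options_handler runs on cluster creation and injects
# this user's PVC into every worker pod spec.
cluster = gateway.new_cluster()
client = cluster.get_client()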


But I notice that when mounting the per-user volume, worker pod spin-up time is much longer than when mounting just the shared volume (hardcoded in the config). I don’t understand why; both the shared and per-user volumes already exist before the mount.

> But how do we mount per-user volumes? For example, UserX has an existing volume VolX, and we want to mount VolX onto the dask worker pods created by UserX.

@drew Could you please share some more details about why you’re looking to do this?

I’m not very familiar with KubeCluster, but I think @jacobtomlinson might have helpful ideas. 🙂

The pod startup time will be related to how your cluster provisions volumes, which you haven’t mentioned. It is likely that having both volumes assigned to the pod makes placement harder, and things may need to be changed on the node to accommodate them.
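
For example, if the per-user PVCs use a storage class with volumeBindingMode: WaitForFirstConsumer, volume binding and attach are deferred until a pod using the claim is actually scheduled, which shows up as extra worker spin-up time. A hypothetical storage class illustrating this (the name and provisioner are placeholders):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: per-user-ssd                       # hypothetical name
provisioner: kubernetes.io/aws-ebs         # example provisioner; yours may differ
volumeBindingMode: WaitForFirstConsumer    # bind/attach happens at pod scheduling time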