Hi, I’m having a hard time figuring out why, when I configure my Dask clusters (created via Dask Gateway) to use multiple processes (instead of threads) for my cores, I end up with only 1 process that is not killed. The logs indicate that multiple workers were started, but only 1 actually registers. I would greatly appreciate any help debugging this. The worker log follows:
Running command: ['/home/dask/dask_worker.runfiles/__main__/dask/dask_worker.py', '--nthreads', '1', '--no-dashboard', '--death-timeout', '90', '--memory-limit', '0', '--nprocs', '10', 'tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786']
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:38085'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:45103'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:33385'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:33421'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:41145'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:38879'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:39257'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:34977'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:41521'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:36827'
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:33443
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:33443
distributed.worker - INFO - dashboard at: xx.xx.xx.x:32939
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-ue4pyb6w
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:33929
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:33929
distributed.worker - INFO - dashboard at: xx.xx.xx.x:41447
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-7eq8azho
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:36491
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:36491
distributed.worker - INFO - dashboard at: xx.xx.xx.x:42831
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-od46okqy
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:39177
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:39177
distributed.worker - INFO - dashboard at: xx.xx.xx.x:33535
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-9k21u5mu
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:39071
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:39071
distributed.worker - INFO - dashboard at: xx.xx.xx.x:46617
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-tp9q9sov
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:35977
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:35977
distributed.worker - INFO - dashboard at: xx.xx.xx.x:36893
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-d1_sjtuv
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:40385
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:40385
distributed.worker - INFO - dashboard at: xx.xx.xx.x:44713
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-ewxj01cd
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:42931
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:42931
distributed.worker - INFO - dashboard at: xx.xx.xx.x:42455
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-r70fh3nz
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:37297
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:37297
distributed.worker - INFO - dashboard at: xx.xx.xx.x:32853
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-foi0bi1v
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:38107
distributed.worker - INFO - Listening to: tls://xx.xx.xx.x:38107
distributed.worker - INFO - dashboard at: xx.xx.xx.x:33155
distributed.worker - INFO - Waiting to connect to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Local Directory: /home/dask/dask_worker.runfiles/__main__/dask-worker-space/worker-48022gbb
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:42931
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:35977
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:39177
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:36491
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:37297
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:39071
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:38107
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:33929
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:40385
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Worker closed
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:38879'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:36827'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:41145'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:41521'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:39257'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:34977'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:33421'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:38085'
distributed.nanny - INFO - Closing Nanny at 'tls://xx.xx.xx.x:45103'
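To make the log above easier to reason about, here is a minimal sketch that tallies the lifecycle messages (nanny start, worker start, registration, worker stop) and reports which worker addresses were started but never stopped. The function name `summarize_worker_log` and the `SAMPLE` excerpt are hypothetical and only for illustration; the regular expressions assume the exact message wording seen in this log, which may differ in other versions of distributed.

```python
import re

# Patterns assume the message formats that appear in the log above.
NANNY_START = re.compile(r"distributed\.nanny - INFO - Start Nanny at:")
WORKER_START = re.compile(r"distributed\.worker - INFO - Start worker at:\s*(\S+)")
REGISTERED = re.compile(r"distributed\.worker - INFO - Registered to:")
WORKER_STOP = re.compile(r"distributed\.worker - INFO - Stopping worker at\s*(\S+)")


def summarize_worker_log(text):
    """Count lifecycle events and list workers that started but never stopped."""
    nannies = 0
    registered = 0
    started = []
    stopped = []
    for line in text.splitlines():
        if NANNY_START.search(line):
            nannies += 1
            continue
        if REGISTERED.search(line):
            registered += 1
            continue
        match = WORKER_START.search(line)
        if match:
            started.append(match.group(1))
            continue
        match = WORKER_STOP.search(line)
        if match:
            stopped.append(match.group(1))
    return {
        "nannies_started": nannies,
        "workers_started": len(started),
        "registered": registered,
        "workers_stopped": len(stopped),
        # Addresses with a "Start worker" line but no "Stopping worker" line.
        "still_running": [addr for addr in started if addr not in stopped],
    }


# Hypothetical shortened excerpt in the same format as the log above.
SAMPLE = """\
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:38085'
distributed.nanny - INFO - Start Nanny at: 'tls://xx.xx.xx.x:45103'
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:33443
distributed.worker - INFO - Start worker at: tls://xx.xx.xx.x:33929
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - Registered to: tls://dask-2fb8c4a2a57e49eca6bd276258effa04.daskgateway:8786
distributed.worker - INFO - Stopping worker at tls://xx.xx.xx.x:33929
"""

if __name__ == "__main__":
    for key, value in summarize_worker_log(SAMPLE).items():
        print(f"{key}: {value}")
```

Tallying the full log above by the same rules gives 10 "Start Nanny" lines, 10 "Start worker" lines, 10 "Registered to" lines, and 9 "Stopping worker" lines, with only the worker at port 33443 never stopped. Read that way, the log suggests that all ten workers registered and nine were stopped afterwards, rather than nine failing to register; the log itself does not say what triggered the stops.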