TypeError: list indices must be integers, on worker startup for AWS EC2Cluster

Hello

I started receiving this error on worker startup today:

   Process Dask Worker process (from Nanny)
     dask/config.py, line 148, in update: old[k] = v
   TypeError: list indices must be integers or slices, not str

This same code worked perfectly all last week and only started to have issues today.

The startup code is

extra_bootstrap = [
 "sudo apt-get install software-properties-common",
 "sudo add-apt-repository ppa:deadsnakes/ppa  -y ",
 "sudo apt install ntpdate", 
 "sudo ntpdate time.nist.gov",
 "sudo apt install build-essential",
 "sudo apt install gcc",
 "sudo apt-get update  -y ",
 "sudo apt-get install python3.7 -y",
 "python -m pip install awscli",
 "python -m pip install boto botocore",
 "python -m pip install jupyter-server-proxy", 
 "export DASK_DISTRIBUTED__LOGGING=debug"
 ]

req_dask_worker_cnt=2
py_packages={"EXTRA_PIP_PACKAGES":"s3fs asyncio lightgbm panda scikit-learn seaborn scipy scikit-learn-intelex matplotlib graphviz joblib nbconvert dpcpp-cpp-rt aiobotocore boto3 aioboto3==11.2.0 "}
req_debug_mode=True
req_auto_shutdown=False
req_docker_args=""
worker_instance_type='r6id.8xlarge'  # r = memory optimized, i = Intel, d = local NVMe instance storage
scheduler_instance_type='m6i.4xlarge'  # m = general purpose, i = Intel, no local instance storage
region_tag="us-east-2"
docker_image_tag="ghcr.io/dask/dask"

from dask_cloudprovider.aws import EC2Cluster

cluster = EC2Cluster(
    env_vars=py_packages,
    debug=req_debug_mode,
    filesystem_size=800,
    docker_image=docker_image_tag,
    security_groups=["sg-0952a59ff138d0edf"],
    worker_instance_type=worker_instance_type,
    scheduler_instance_type=scheduler_instance_type,
    iam_instance_profile={'Arn': 'arn:aws:iam::xxxx:instance-profile/DaskThorProfile'},
    n_workers=req_dask_worker_cnt,
    security=False,
    key_name='awsthoranalytics',
    extra_bootstrap=extra_bootstrap,
    auto_shutdown=False,
    instance_tags={"application":"daskthor"}, # can be set to any name you want
    vpc="vpc-0090c55c8d60b5657", 
    subnet_id='subnet-04f80ecea51538ed8',
    availability_zone="us-east-2a",
    region='us-east-2',
    ########################################
    enable_detailed_monitoring=True,
    # private / public is not relevant to the performance, only influences the security
    use_private_ip=False
)

cluster.wait_for_workers(req_dask_worker_cnt)
print("Status - Workers online\n\n ")


Please let me know if you can assist
Thanks

If the code worked last week, this might be an environment mismatch between Client and Scheduler/Workers. I guess the latest Dask docker image doesn’t work anymore with your local environment.

Could you try to specify an image tag, or to update the environment in which you run the code above?
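For example, the image tag can be built from the Dask version you run locally, so the client and the cluster stay in step. This is only a sketch: `pinned_image` is a hypothetical helper, and the `<version>-py<python>` tag pattern is an assumption based on tags such as `2023.8.1-py3.10`, so please check that the resulting tag really exists in the registry.

```python
def pinned_image(dask_version: str, python_version: str = "3.10") -> str:
    """Build a version-pinned tag for the ghcr.io/dask/dask image (assumed tag pattern)."""
    return f"ghcr.io/dask/dask:{dask_version}-py{python_version}"


# In a real session the version would come from `import dask; dask.__version__`.
docker_image_tag = pinned_image("2023.8.1")
print(docker_image_tag)
```

The resulting string would then be passed as `docker_image=docker_image_tag` when creating the cluster, instead of the untagged image name.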

There have been no changes in the environment or the code. After getting the error I changed the docker image from this

ghcr.io/dask/dask:latest

to this

ghcr.io/dask/dask:2023.8.1-py3.10

The workers then come online.

Today I tried the same code again, changing only the Docker image back to the latest:

ghcr.io/dask/dask:latest

and the workers never come online. They just keep cycling through the startup loop because of the TypeError raised in dask/config.py.

The environment is using the default docker image with no changes.
The execution environment is AWS, where I am using a standard instance with an AWS Ubuntu AMI.

Are there any parameters or flags I can add to get more detail on what the config process is trying to do, and maybe pass them through the API?
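For context on the traceback: the failing statement `old[k] = v` is what a recursive dictionary merge does when it walks into a value it expects to be a mapping. The sketch below is a simplified stand-in, not the actual Dask source, and the setting names are made up. It shows how the same TypeError appears when one config source stores a list under a key where another stores a mapping, which is the kind of disagreement that can arise when two sides run different versions.

```python
from collections.abc import Mapping


def merge_into(old, new):
    """Recursively merge `new` into `old` (simplified stand-in for a config update)."""
    for key, value in new.items():
        if isinstance(value, Mapping):
            if key not in old or old[key] is None:
                old[key] = {}
            merge_into(old[key], value)
        else:
            old[key] = value  # raises TypeError if `old` is a list and `key` is a str
    return old


# Hypothetical settings: one source stores a list, the other a mapping.
worker_defaults = {"section": {"option": ["a", "b"]}}
inherited = {"section": {"option": {"name": "a"}}}

try:
    merge_into(worker_defaults, inherited)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If this is what is happening, the thing to look for is a config key whose type differs between the two environments.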

Can you check what version of Dask you have installed locally? If setting the image to 2023.8.1 resolves the issue I would guess you have 2023.8.1 locally too.

If you install 2023.9.0 locally and use the 2023.9.0 image does that work?
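A quick way to compare versions is to scan the nested dictionary that `Client.get_versions()` returns. The helper below is a hypothetical sketch (`find_mismatches` is not part of Dask) and the sample data is only illustrative; with a live cluster you would pass the real result instead, and `client.get_versions(check=True)` can also be used to have distributed flag mismatches itself.

```python
def find_mismatches(versions, packages=("dask", "distributed")):
    """Return {package: {node: version}} for packages whose versions differ."""
    nodes = {"client": versions["client"], "scheduler": versions["scheduler"]}
    nodes.update(versions.get("workers", {}))
    mismatches = {}
    for package in packages:
        seen = {name: info["packages"].get(package) for name, info in nodes.items()}
        if len(set(seen.values())) > 1:
            mismatches[package] = seen
    return mismatches


# Illustrative data in the same shape as the get_versions() output.
sample = {
    "client": {"packages": {"dask": "2023.7.1", "distributed": "2023.7.1"}},
    "scheduler": {"packages": {"dask": "2023.8.1", "distributed": "2023.8.1"}},
    "workers": {
        "tcp://worker-1": {"packages": {"dask": "2023.8.1", "distributed": "2023.8.1"}},
    },
}
mismatches = find_mismatches(sample)
print(mismatches)
```

An empty result means the client, scheduler, and workers agree on the listed packages.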

Here are the versions across the cluster:

{'scheduler': {'host': {'python': '3.10.12.final.0',
                        'python-bits': 64,
                        'OS': 'Linux',
                        'OS-release': '5.15.0-1043-aws',
                        'machine': 'x86_64',
                        'processor': 'x86_64',
                        'byteorder': 'little',
                        'LC_ALL': 'C.UTF-8',
                        'LANG': 'C.UTF-8'},
               'packages': {'python': '3.10.12.final.0',
                            'dask': '2023.8.1',
                            'distributed': '2023.8.1',
                            'msgpack': '1.0.5',
                            'cloudpickle': '2.2.1',
                            'tornado': '6.3.3',
                            'toolz': '0.12.0',
                            'numpy': '1.25.2',
                            'pandas': '2.0.3',
                            'lz4': '4.3.2'}},
 'workers': {'tcp://11.0.0.19:44155': {'host': {'python': '3.10.12.final.0',
                                                'python-bits': 64,
                                                'OS': 'Linux',
                                                'OS-release': '5.15.0-1043-aws',
                                                'machine': 'x86_64',
                                                'processor': 'x86_64',
                                                'byteorder': 'little',
                                                'LC_ALL': 'C.UTF-8',
                                                'LANG': 'C.UTF-8'},
                                       'packages': {'python': '3.10.12.final.0',
                                                    'dask': '2023.8.1',
                                                    'distributed': '2023.8.1',
                                                    'msgpack': '1.0.5',
                                                    'cloudpickle': '2.2.1',
                                                    'tornado': '6.3.3',
                                                    'toolz': '0.12.0',
                                                    'numpy': '1.25.2',
                                                    'pandas': '2.0.3',
                                                    'lz4': '4.3.2'}},
             'tcp://11.0.0.91:43891': {'host': {'python': '3.10.12.final.0',
                                                'python-bits': 64,
                                                'OS': 'Linux',
                                                'OS-release': '5.15.0-1043-aws',
                                                'machine': 'x86_64',
                                                'processor': 'x86_64',
                                                'byteorder': 'little',
                                                'LC_ALL': 'C.UTF-8',
                                                'LANG': 'C.UTF-8'},
                                       'packages': {'python': '3.10.12.final.0',
                                                    'dask': '2023.8.1',
                                                    'distributed': '2023.8.1',
                                                    'msgpack': '1.0.5',
                                                    'cloudpickle': '2.2.1',
                                                    'tornado': '6.3.3',
                                                    'toolz': '0.12.0',
                                                    'numpy': '1.25.2',
                                                    'pandas': '2.0.3',
                                                    'lz4': '4.3.2'}}},
 'client': {'host': {'python': '3.10.12.final.0',
                     'python-bits': 64,
                     'OS': 'Linux',
                     'OS-release': '5.15.0-1040-aws',
                     'machine': 'x86_64',
                     'processor': 'x86_64',
                     'byteorder': 'little',
                     'LC_ALL': 'None',
                     'LANG': 'C.UTF-8'},
            'packages': {'python': '3.10.12.final.0',
                         'dask': '2023.7.1',
                         'distributed': '2023.7.1',
                         'msgpack': '1.0.5',
                         'cloudpickle': '2.2.1',
                         'tornado': '6.3.2',
                         'toolz': '0.12.0',
                         'numpy': '1.25.2',
                         'pandas': '2.0.3',
                         'lz4': '4.3.2'}}}

So to use the latest version of the Docker image, I will need to upgrade all the Dask components?
Thank you. I may need to wait to do that, since we are looking to complete the proof of concept.