Executing this code produces the following warning and error:
Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44871 instead
2023-02-03 15:06:17,523 - distributed.preloading - INFO - Creating preload: dask_cuda.initialize
2023-02-03 15:06:17,523 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2023-02-03 15:06:17,703 - distributed.worker - WARNING - Mismatched versions found
event = cls(**kwargs)
TypeError: __init__() missing 1 required positional argument: 'run_id'
2023-02-03 15:05:34,013 - distributed.nanny - ERROR - Worker process died unexpectedly
Even after this error, I can see Python still using the CPU and GPU. Is this expected behavior?
It really depends on what caused the error, how many workers you started with, and several other factors. Other workers might still be alive. Dask might also be trying to launch a new Worker to process the data. Or the Worker did not die cleanly and some leftover process is still doing something. You can probably inspect all of that using the Dask Dashboard.
But anyway, in my opinion it would be better to solve the error you got in the first place, don’t you agree?
First, the warning: I’m not sure how you can have mismatched versions, provided you start the cluster from your main script.
Then there is the positional argument error, which might be a consequence of the version mismatch reported in the warning. Do you have more details to add on your setup?
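To gather those details, a small sketch like the following might help. It uses only the standard library to report which versions of the relevant packages are installed in the active environment. The package list and the helper name `installed_versions` are my own assumptions, not part of Dask, so adjust them to your setup and run it in the same environment that launches the workers.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical list of packages worth comparing; adjust to your environment.
PACKAGES = ["dask", "distributed", "dask-cuda", "dask-ml", "cuml"]


def installed_versions(packages=PACKAGES):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package has no metadata in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

dask and distributed are normally released together, so if the two report different versions here, that would be worth a closer look.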
I am running this code locally on a laptop with Ubuntu 20.04 and a GTX 1660 Ti card, in the rapids-22.12 environment, not on an HPC system.
You are right that old processes may not have shut down when I rerun this code. I also tried calling client.shutdown() at the end of the code, but it did not help.
The features are float64 and int64 and are scaled with StandardScaler. I am trying hyperparameter optimization, but I could not get it to run with dask_ml and cuML. Something similar runs with plain scikit-learn.
Please let me know if you need some specific information.
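Regarding the leftover processes: the "Port 8787 is already in use" warning suggests that something was still listening on the default dashboard port when the script started. Here is a minimal sketch for checking this before a rerun, using only the standard library. The helper name `port_in_use` is hypothetical and not part of Dask, and it only tells you that some process is listening, not which one.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8787 is the dashboard port named in the warning.
    print("Port 8787 in use:", port_in_use(8787))
```

If this reports the port as in use before the script has started its own cluster, an earlier scheduler or notebook kernel is probably still running and may need to be stopped first.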