I am using the dask distributed package to create an EC2/ECS cluster. I want to read ML models within the workers, something like:
import pickle

def read_model(model_path):
    # pickle.load expects a file object, not a path
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    return model

future = client.submit(read_model, model_path)
model = future.result()
How do I mount a local folder as a volume while creating the ECS/EC2 cluster using Python?
I tried looking into the parameters provided here.
Hi @hjain371, welcome here.
Which local folder are you referring to? Is it a folder on another AWS VM? Is it an NFS volume or something like that?
In general, when working on a cloud computing facility, it is recommended to store your data in a shared object store. Couldn’t you use AWS S3 to store your model?
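To illustrate the suggestion above, here is a minimal sketch of a worker task that loads a pickled model from object storage instead of a local folder. The bucket and key below are hypothetical placeholders; `fsspec` resolves `s3://` paths through `s3fs` using the instance’s AWS credentials.

```python
import pickle

import fsspec


def read_model(model_path):
    # fsspec.open handles s3://, gcs://, local paths, etc.,
    # so the same task works on any worker with access to the store
    with fsspec.open(model_path, "rb") as f:
        return pickle.load(f)


# On the client (hypothetical bucket/key):
# future = client.submit(read_model, "s3://my-bucket/models/model.pkl")
# model = future.result()
```

Because the workers fetch the model themselves, nothing needs to be mounted on the EC2/ECS instances.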
As you said, we might move to S3 for this, but in this scenario by local folder I mean my local host. I had mounted the folder at the time of docker run, but when I try to access it from the cluster workers, I get a “path not found” error.
I found a few parameters in the Amazon Web Services (AWS) — Dask Cloud Provider 0+untagged.50.gef21317 documentation, like
instance_tags=None, volume_tags=None
but couldn’t find any info on how to use them.
You can’t (okay, you probably can with some complex setup and expert networking skills) mount your local laptop or computer file system on a VM executing in the AWS cloud. Or there is something I’m not understanding about the setup you are talking about.
What are you doing with the Docker arguments? Can the VM on which the Dask Workers run actually see your file system?
As the documentation states:
volume_tags: dict, optional
Tags to be applied to all EBS volumes upon creation. By default, includes “createdBy”: “dask-cloudprovider”
Nothing to do with mounting any file system.
Yes, while running the Docker image we are mounting the local folders:
docker run --platform linux/amd64 \
    -p 8899:8888 -p 8900:8787 \
    -v $(HOME)/.aws:/root/.aws:ro \
    -v $(PROJECTX_DATA):/data \
    -v $(PROJECTX_MODELS):/models \
    -v $(PROJECTX)/notebooks:/notebooks \
    $(APP_NAME) $(PROJECT_ARGS)
Could you tell me again where these folders (PROJECTX_*) are located?
If they’re on your laptop, then this won’t work. The Docker containers with dask-cloudprovider are launched on AWS instances. If these folders don’t exist on those instances, then you won’t see anything in your containers.
It sounds like you might want to use the client.upload_file method to upload your model file to the workers.
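As a hedged sketch of that approach, assuming a running dask.distributed Client: Client.upload_file ships a file to every worker and places it in that worker’s local working directory, from which a task can read it back. The filename "model.pkl" is a placeholder.

```python
import os
import pickle

from dask.distributed import get_worker


def read_uploaded_model(filename):
    # Client.upload_file stores uploaded files in each worker's
    # local_directory; open the model from there inside the task
    worker = get_worker()
    path = os.path.join(worker.local_directory, filename)
    with open(path, "rb") as f:
        return pickle.load(f)


# On the client side (hypothetical file):
# client.upload_file("model.pkl")   # sends the file to all workers
# future = client.submit(read_uploaded_model, "model.pkl")
# model = future.result()
```

Note this re-uploads the model to every worker, so for large models or frequently changing clusters, S3 remains the simpler option.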
Thanks, I found a nice blog on Coiled covering the same topic, and hopefully this will sort the issue.
I’m curious, which article?