chensj
March 2, 2023, 9:34am
1
Greetings,
I started a scheduler with the command "dask-scheduler --port 8786" and want to submit a demo to it over TCP, but the dashboard always shows "total: 117, waiting: 117". I am new to Dask and don't know why it isn't working. When I run it locally, it works. Here is my code:
from dask.distributed import Client
import dask.dataframe as dd

# Connect to the remote scheduler (address redacted)
client = Client("tcp://xxxx:8786")
# client = Client()
ddf = dd.read_parquet(
    "s3://dask-data/nyc-taxi/nyc-2015.parquet/part.*.parquet",
    columns=["passenger_count", "tip_amount"],
    storage_options={"anon": True}
)
result = ddf.groupby("passenger_count").tip_amount.mean().compute()
Thank you for helping me.
Hi @chensj , welcome here!
You said you started a dask-scheduler, but did you also start at least one worker to process the tasks?
What does your local code look like when it works?
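One way to check this from Python is to ask the scheduler how many workers are connected. The following is only a minimal sketch, assuming `dask.distributed` is installed; the helper name `check_scheduler` is hypothetical, and you would pass it your own scheduler address (for example the "tcp://xxxx:8786" placeholder from the question).

```python
from dask.distributed import Client


def check_scheduler(address: str, timeout: float = 10) -> int:
    """Connect to a scheduler and return how many workers are attached.

    Hypothetical helper for illustration. If no worker registers within
    `timeout` seconds, wait_for_workers raises a timeout error instead of
    leaving submitted tasks in the "waiting" state indefinitely.
    """
    with Client(address) as client:
        client.wait_for_workers(n_workers=1, timeout=timeout)
        # nthreads() maps each worker address to its thread count,
        # so its length is the number of connected workers.
        return len(client.nthreads())
```

If this times out against your scheduler, no worker has joined it, which would explain why every task stays in "waiting".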
chensj
March 2, 2023, 10:05am
3
Hi @guillaumeeb,
The local code is:
client = Client()
ddf = dd.read_parquet(
    "s3://dask-data/nyc-taxi/nyc-2015.parquet/part.*.parquet",
    columns=["passenger_count", "tip_amount"],
    storage_options={"anon": True}
)
result = ddf.groupby("passenger_count").tip_amount.mean().compute()
Maybe I didn't start any workers.
How can I start a worker?
See "Command Line" in the Dask documentation; you need to use the dask worker command.
However, if you're on a single server, using Client or LocalCluster will be roughly equivalent.
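To illustrate the single-server case, here is a rough sketch of creating the cluster explicitly rather than relying on a bare `Client()`. The function name `start_local` and the worker settings are assumptions chosen for the example. Note that `processes=False` runs the workers as threads inside the current process, which differs from what `Client()` usually does by default.

```python
from dask.distributed import Client, LocalCluster

# With a standalone scheduler, workers are started from a shell instead, e.g.:
#   dask worker tcp://xxxx:8786
# ("tcp://xxxx:8786" is the placeholder address from the question).


def start_local(n_workers: int = 2) -> Client:
    """Start an in-process scheduler plus workers and return a connected client.

    Hypothetical helper for illustration only.
    """
    cluster = LocalCluster(
        n_workers=n_workers,
        threads_per_worker=1,
        processes=False,         # assumption: thread-based workers, no subprocesses
        dashboard_address=None,  # assumption: dashboard disabled for this sketch
    )
    client = Client(cluster)
    # Block until the requested workers have registered with the scheduler.
    client.wait_for_workers(n_workers=n_workers, timeout=30)
    return client
```

With the returned client active, the groupby from the question should run as it did with `Client()`, since workers are available to pick up the tasks.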