I’m encountering an issue where my Dask cluster submission code runs perfectly in JupyterLab, but fails when executed in PyCharm. The exact same code is used in both environments. I’m looking for guidance on what might be causing this discrepancy and how to resolve it.
Hi @Payyalar, welcome to the Dask community!
Your post lacks a bit of context. Which code are you trying to run? What error do you get? Are you running in the same Python environment? On the same machine?
import dask.dataframe as dd
import dask.array as da
from dask.distributed import Client


class DataProcess:
    def calculate_mean(self, df):
        return df.mean()

    def process_data(self, client):
        data = da.random.random((100, 3), chunks=(10, 3))
        df = dd.from_dask_array(data, columns=['A', 'B', 'C'])
        # Submit the mean computation to the cluster; returns a Future
        mean_future = client.submit(self.calculate_mean, df)
        return mean_future


client = Client("scheduler tcp")  # placeholder for the scheduler's TCP address
persisted_result = DataProcess().process_data(client)
print(persisted_result.result())
I ran the above code in both Jupyter and PyCharm.
When running it from Jupyter, I can clearly see activity in the Task Stream and Progress tabs of the dashboard.
But when running it from PyCharm, I can't see any submission happening, even though execution completes in both.
I believe the computation is happening locally rather than on the Dask cluster when using PyCharm.
Note: the environment and the client are the same in both.
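For reference, this is how I create the client and what I print to double check it in each environment (just a sketch; "scheduler tcp" stands in for the real tcp://... address):

from dask.distributed import Client

# Same placeholder scheduler address used in Jupyter and PyCharm
client = Client("scheduler tcp")

# Both of these should report the same remote scheduler in both environments
print(client)                 # repr includes the scheduler address and worker count
print(client.dashboard_link)  # dashboard URL of the cluster this client is using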
Your code is a bit complicated to read. Could you first try something simpler, like:
from dask.distributed import Client
import dask.dataframe as dd
import dask.array as da
client = Client("scheduler tcp")
data = da.random.random((100,3), chunks=(10,3))
df = dd.from_dask_array(data, columns=['A','B','C'])
print(df.mean().compute())
I believe you mean that the Dask cluster you are using, that is the scheduler TCP address, is the same in both. If so, there is no reason you would not see the submission on the dashboard when running from PyCharm.
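To double check from PyCharm, you could also submit a tiny task and look at where it actually runs. This is only a sketch with a hypothetical where_am_i helper, and it assumes the client above is already connected; if it prints a worker's hostname, the work really is going to the cluster:

import socket

def where_am_i():
    # Runs on whichever worker picks up the task
    return socket.gethostname()

future = client.submit(where_am_i)
print(future.result())  # expect a worker hostname, not your local machine's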