Hi,
I am forwarding a question that was originally posted on Stack Overflow.
Here is the failing code:
import dask.dataframe as dd
from dask.distributed import Client, LocalCluster
import pandas as pd

local_file = 'example.csv'
df0 = pd.DataFrame({'id': [0, 1, 3], 'model': ['A', 'B', 'C']})
df0.to_csv(local_file)

if __name__ == '__main__':
    with LocalCluster(processes=False) as cluster, Client(cluster) as client:
        df = dd.read_csv(local_file)
        print('df :')
        print(df.compute())
        df.head()
Inside the local cluster/client, df.compute() does not return a pandas DataFrame with the values in it, but rather a "Serialize" graph.
After that, df.head() raises an error. Neither problem occurs outside the client.
Can anyone fix this apparent bug?