Hi @guillaumeeb
Thanks for the response.
It appears you are exploding the Pandas dataframe and not the Dask dataframe.
I also want a solution that works when the length of the list values is arbitrary or not known in advance, since that affects how the meta parameter is constructed.
Please note that all list entries in a column will have the same length. However, this length can vary and cannot always be determined in advance.
This is the current Pandas approach, which works:
Pandas DataFrame Approach
import pandas as pd
data = {'list_column': [[1, 2, 3], [4, 5, 6], [7, 8, 9]]}
df = pd.DataFrame(data)
df_exploded = df.explode('list_column')
df_horizontal = df_exploded.pivot_table(index=df_exploded.index, columns=df_exploded.groupby(df_exploded.index).cumcount(), values='list_column')
df_horizontal
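For comparison, here is a minimal sketch of another way to get the same wide layout in Pandas without explode and pivot_table, assuming every list in the column has the same length. The helper name `lists_to_columns` is hypothetical, and the resulting dtypes may differ from the pivot_table output:

```python
import pandas as pd

def lists_to_columns(df, column):
    """Spread equal-length lists in `column` across columns 0..n-1 (hypothetical helper)."""
    # Each list becomes one row; the number of columns follows the list length,
    # so nothing about the length is hard-coded.
    return pd.DataFrame(df[column].tolist(), index=df.index)

data = {'list_column': [[1, 2, 3], [4, 5, 6], [7, 8, 9]]}
df = pd.DataFrame(data)
wide = lists_to_columns(df, 'list_column')
print(wide)
```

The number of output columns is taken from the data itself, which is the property I need to carry over to Dask.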
However, what I am actually looking for is a pure Dask DataFrame equivalent. I have not been able to get the version below to work; it always fails with ValueError: Grouper and axis must be same length
Dask DataFrame Approach
import pandas as pd
import dask.dataframe as dd
data = {'list_column': [[1, 2, 3], [4, 5, 6], [7, 8, 9]]}
pdf = pd.DataFrame(data)
ddf = dd.from_pandas(pdf, npartitions=1)
ddf_exploded = ddf.explode('list_column')
ddf_horizontal = ddf_exploded.pivot_table(index=ddf_exploded.index, columns=ddf_exploded.groupby(df_exploded.index).cumcount(), values='list_column')
ddf_horizontal.compute()
Thank you so much for your help in this regard.