When I use assign and apply to create a new column in my dataframe, the meta argument doesn’t seem to take effect. As shown below, meta is “int”, but the resulting column is “float64”. No error or warning is reported.
train_data = train_data.assign(current=train_data['sequence_result'].apply(lambda x: x[0], meta='int'))
Here, each value in the “sequence_result” column is a list such as [133669542676, 1, 148, -1, 133658378700, 0].
Hi @lensory and welcome! If I’m understanding correctly, sequence_result is a column of lists, where each element is an integer? I created a small reproducer, but wasn’t able to reproduce your result:
import dask.dataframe as dd
import pandas as pd
df = pd.DataFrame({'a': range(0, 3), 'b': ['x', 'y', 'z'], 'c': [[1, 2, 3] for _ in range(3)]})
ddf = dd.from_pandas(df, npartitions=2)
# Compute the new column and inspect the resulting dtypes
dtypes = ddf.assign(current=ddf['c'].apply(lambda x: x[0], meta=int)).compute().dtypes
print(dtypes)
Would you be able to share a minimal reproducible example that shows this behavior?
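In the meantime, here is a rough sketch of one possible cause. It is only a guess, since I can't see your data. As far as I understand, meta only describes the expected output to Dask and is not used to cast the computed result, so the final dtype is whatever pandas infers on each partition. The series below are made-up examples (not your data), and the -1 fill value is an arbitrary placeholder:

```python
import pandas as pd

# Hypothetical data: every list starts with an integer.
clean = pd.Series([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Hypothetical data: one list starts with a float, another with a missing value.
mixed = pd.Series([[1, 2, 3], [4.0, 5, 6], [float("nan"), 8, 9]])

first_clean = clean.apply(lambda x: x[0])
first_mixed = mixed.apply(lambda x: x[0])

print(first_clean.dtype)  # int64
print(first_mixed.dtype)  # float64

# If an integer column is required, handle missing values and cast explicitly.
# The fill value -1 is an assumption chosen only for illustration.
fixed = first_mixed.fillna(-1).astype("int64")
print(fixed.dtype)  # int64
```

If your real column has a float or a missing value in the first position of any list, that could explain the float64 result even with meta='int'. Checking the dtype of a small pandas sample of your data, as above, might help narrow it down.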