Dask created a datetimeindex and I cannot assign it back to the source ddf

@dennisd Thanks for the details!

I was able to reproduce this. It looks like the problem is that you’re calling pandas’ to_datetime and assigning the result to a Dask DataFrame. You’ll need to use the Dask DataFrame API here instead:

import pandas as pd
import dask.dataframe as dd

# Sample data: day-month-year strings plus one missing value
df = pd.DataFrame({'date3': ['1232021', '1332021', '1432021', '1532021', None]})
ddf = dd.from_pandas(df, npartitions=2)

# Use Dask's to_datetime (not pandas') so the result stays a lazy Dask Series
ddf['date3'] = dd.to_datetime(ddf['date3'], format="%d%m%Y")
ddf.compute()

I don’t think you’ll need your step-wise workaround after this. To clarify the error itself: the TypeError most likely comes from floats/NaNs in your DataFrame, because datetime.strptime only accepts strings. So you may need to clean your dataset before converting it to datetime.
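To illustrate the strptime point, here is a minimal standard-library sketch. It is not taken from your code: parse_or_none and the sample values are hypothetical names and data I made up for the example. It shows the TypeError on a float NaN, and one possible way to guard against non-string values before parsing.

```python
from datetime import datetime

FMT = "%d%m%Y"

# datetime.strptime rejects anything that is not a str, including float NaN
try:
    datetime.strptime(float("nan"), FMT)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
    print(f"{error_name}: {exc}")


def parse_or_none(value, fmt=FMT):
    """Hypothetical helper: parse strings, return None for anything else."""
    if not isinstance(value, str):
        return None  # NaN, None, numbers, ...
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None  # a string that does not match the format


# Made-up sample values mixing valid strings with missing/invalid entries
raw = ["1232021", float("nan"), None, "not a date"]
parsed = [parse_or_none(v) for v in raw]
print(parsed)
```

Whether you drop those rows, fill them, or leave them as missing values depends on your data, so treat the helper as a starting point rather than a recommendation.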

Let me know if this helps!
