Hi,
I have three datetime columns that were initially loaded into a Dask DataFrame (ddf) as strings, so I need to convert them to datetime. I followed the approach from the book “Data Science with Python and Dask”, modified for my date format:
from datetime import datetime
date3_parsed = ddf['date3'].apply(lambda x: datetime.strptime(x, "%d%b%Y"), meta=datetime)
date3_a = ddf.drop('date3', axis=1)
date3_b = date3_a.assign(date3=date3_parsed)
The first two date columns worked as expected, but the third one is giving me a hard time. When I used the code above on it, I got a TypeError:
TypeError: strptime() argument 1 must be str, not float
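The NaT in the parsed output further down suggests that this column contains missing values. Assuming those are stored as float NaN (the usual pandas representation of a missing entry in a string column), the TypeError can be reproduced with the standard library alone. The helper name parse_or_none is hypothetical, and returning None for non-string input is only one possible way to handle such rows:

```python
from datetime import datetime

FMT = "%d%b%Y"  # matches strings such as "31Dec2021"

# Assumption: a missing entry in the string column shows up as a float NaN.
missing = float("nan")

error_message = None
try:
    datetime.strptime(missing, FMT)
except TypeError as exc:
    # strptime only accepts str, so the float NaN is rejected here.
    error_message = str(exc)
    print(error_message)


def parse_or_none(value):
    """Hypothetical helper: parse strings, map anything else to None."""
    if isinstance(value, str):
        return datetime.strptime(value, FMT)
    return None


parsed = [parse_or_none(v) for v in ["31Dec2021", missing, "20Mar2022"]]
print(parsed)
```

If this reproduces the same message, the lambda passed to apply would need a similar guard for non-string values.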
When I tried the following:
ddf['date3'] = pd.to_datetime(ddf['date3'], format = "%d%b%Y")
the conversion took an hour, and then I got the following error:
ValueError: Length of values (13090962) does not match length of index (2)
I broke the steps apart.
- parsed the column using pd.to_datetime
- dropped the column
- assigned it back
The parsing worked. I got a DatetimeIndex in this form:
DatetimeIndex(['2021-12-31', '2021-11-30', 'NaT', '2022-03-20'])
However, when I call .head() on it, it says that DatetimeIndex does not have head.
Dropping the column also worked.
The issue is with the last step: that is where I get the ValueError shown above.
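For reference, the ValueError has the same shape as the one pandas raises when an index of one length is assigned as a column of a frame with a different length. Below is a minimal pandas-only sketch of that mismatch, followed by a per-partition parsing function. The name parse_date3 and the sample data are hypothetical. It is an assumption, not verified here, that applying such a function to each partition (for example through Dask's map_partitions) would avoid the mismatch, since each partition is parsed and assigned at its own length:

```python
import pandas as pd

FMT = "%d%b%Y"  # matches strings such as "31Dec2021"

# Eagerly parsed values, including one missing entry that becomes NaT.
raw = pd.Series(["31Dec2021", "30Nov2021", None, "20Mar2022"])
parsed_index = pd.DatetimeIndex(pd.to_datetime(raw, format=FMT))

# Assigning 4 parsed values to a 2-row frame fails with a length mismatch.
small = pd.DataFrame({"id": [1, 2]})
mismatch_message = None
try:
    small["date3"] = parsed_index
except ValueError as exc:
    mismatch_message = str(exc)
    print(mismatch_message)


def parse_date3(partition):
    """Hypothetical helper: parse date3 inside one pandas DataFrame."""
    return partition.assign(
        date3=pd.to_datetime(partition["date3"], format=FMT)
    )


# Stand-in for a single partition; lengths stay aligned within it.
frame = pd.DataFrame({"id": [1, 2, 3, 4], "date3": raw})
result = parse_date3(frame)
print(result.dtypes)

# Assumption, not run here: with Dask this might be applied lazily as
# ddf = ddf.map_partitions(parse_date3)
```

A DatetimeIndex is an index rather than a Series or DataFrame, which would also explain why .head() is not available on it.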