AttributeError: SeriesGroupBy object has no attribute ffill

Hi, I’m trying to convert some pandas code into Dask.

In pandas, I can easily ffill my SeriesGroupBy object,

toy example:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a_cat': list('aabbbaccc'),
    'b_num': [np.nan if i%3!=0 else (i+1) for i in range(9)]
})

df.groupby('a_cat')['b_num'].ffill()

but when I try to do it in Dask

import dask.dataframe as dd

ddf = dd.from_pandas(df, npartitions=2)  # older Dask releases require npartitions or chunksize
ddf.groupby('a_cat')['b_num'].ffill()

I get an “AttributeError: SeriesGroupBy object has no attribute ffill”

What would be the best way to achieve the above in Dask?

Thanks :slight_smile:

Hi @CJC-ds and welcome! This is a great question. You’re correct that, unfortunately, ffill is not yet implemented for Dask SeriesGroupBy objects. Let me see if I can come up with a workaround, but in the meantime, I would encourage you to open an issue with a feature request!


Hi @scharlottej13, thank you! I just made the feature request at the GitHub repo :smiley:


Thanks @CJC-ds! Cross-linking the open issue here: Missing Built-in methods for SeriesGroupBy `ffill` · Issue #8708 · dask/dask · GitHub
