Dask loc not working: can't use the assignment (=) operator with it

for col1 in columns_1:
    for col2 in columns_2:
        df.loc[df['any_column_in_df'] == col2, col1] = 0

What I want: an alternative way to do this in Dask. The code above works in pandas.
Problem: I can't use assignment (=) with df.loc in Dask. Is that because in-place operations are not supported?
Explanation: I want to assign 0 (or another value) wherever the condition is met and get a DataFrame back (not a Series).
I tried mask, and map_partitions with df.replace (which works fine for this simple one-column value replacement and returns a DataFrame, as required):

def replace(x: pd.DataFrame) -> pd.DataFrame:
    return x.replace(
        {'any_column_to_replace_value': [np.nan]},
        {'any_column_to_replace_value': [0]}
    )

df = df.map_partitions(replace)

How can I do the same for the first snippet and still get a DataFrame back?

Thanks in advance. Please help me, Dask experts; I'm new to Dask and still exploring it.

Answer by @martindurant on Gitter:

This is a row-wise computation, so you can use apply or map_partitions:

def process(df):
    for col1 in columns_1:
        for col2 in columns_2:
            df.loc[df['any_column_in_df'] == col2, col1] = 0
    return df

df2 = df.map_partitions(process)

Thank you for following up with the solution @TheSunilVarma!
