Hello guys,
While searching for some kind of "validate" parameter on the Dask DataFrame merge, I found the "dask.dataframe.multi.merge" documentation.
My question is: how do I access this method? I tried to follow its example with df.multi.merge(df2), but Dask just returns "'DataFrame' object has no attribute 'multi'".
I feel a little lost in the Dask documentation.
Hi @frbelotto,
What are you trying to validate? The same checks as described in the docstring?
dataframe.multi.merge is a function in the dask.dataframe.multi module, not an attribute of a DataFrame, so you would import it this way:
from dask.dataframe.multi import merge
However, I'm afraid the docstring is copied from pandas and does not reflect the real signature of the method: Dask doesn't provide any validate kwarg.
In pandas I usually use the 'validate' parameter on a merge to check for inconsistencies or unexpected duplicate values. Dask's merge has no such parameter.
While searching for a workaround I found multi.merge. I couldn't really understand it, but its documentation shows the validate parameter. The examples in that documentation are about the merge method, not multi.merge.
The multi.merge documentation comes from pandas. You can use this function with the import I wrote above. However, the docstring is not correct: there is no validate kwarg available with Dask.