Index does not exist on the expected division

I have a dask dataframe whose divisions are (1, 5923, 11845). I want to fetch the row with index 8851. When I call df.get_partition(1).compute().loc[8851], I get an error saying the key doesn’t exist. When I call df.get_partition(0).compute().loc[8851], the item is there. Considering 8851 is a number between 5923 and 11845, I would expect it to exist in the second partition.

Can someone please explain why it isn’t? Are divisions and partitions separate concepts? I’m using dask 2023.5.0.

Hi @qherm, welcome to the Dask community!

I believe divisions and partitions should be aligned in all cases; otherwise divisions wouldn’t be very useful.
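
To spell out what "aligned" means here: divisions hold the minimum index value of each partition plus the maximum index value of the last one. Partition i should therefore cover index values from divisions[i] up to, but not including, divisions[i+1], and the final partition also includes its upper bound. Below is a minimal sketch of that lookup in plain Python, using the divisions from the question. The helper name expected_partition is hypothetical and is not a Dask function.

```python
from bisect import bisect_right


def expected_partition(divisions, key):
    """Return the partition number that should hold `key` according to `divisions`."""
    if key < divisions[0] or key > divisions[-1]:
        raise KeyError(f"{key} is outside the divisions {divisions}")
    # Count the boundaries that are <= key; minus one gives the partition
    # whose range starts at or below key.
    position = bisect_right(divisions, key) - 1
    # The last boundary is inclusive, so it belongs to the final partition.
    return min(position, len(divisions) - 2)


divisions = (1, 5923, 11845)  # values from the question
print(expected_partition(divisions, 8851))  # 1, i.e. the second partition
```

By this reading 8851 belongs in partition 1, which matches your expectation. Finding it in partition 0 instead would suggest that the divisions metadata no longer describes the data actually stored in each partition.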

I just tried to reproduce your issue to no avail:

import pandas as pd
import dask.dataframe as dd

df = pd.DataFrame(dict(a=list('a'*1000+'b'*2000), b=list(range(3000))))

ddf = dd.from_pandas(df, npartitions=3)


Could you come up with a reproducer?

What does the output look like when you call df.get_partition(1).compute()? Which index values do you have?
How did you build this dataframe?