I am wondering why repartition(freq='24h') results in round divisions, i.e. divisions that fall on midnight.
I have a Dask dataframe with divisions aligned on 12:00:00:
import dask.dataframe
import dask.datasets
import pandas as pd

df = dask.datasets.timeseries().compute()
df.index += pd.to_timedelta('12:00:00')
dd = dask.dataframe.from_pandas(df, npartitions=15)
dd
repartition(freq='24h') is resulting in round divisions:
The same happens with pd.to_timedelta('1d'), because Dask's repartition_freq() explicitly ceils the first division. I am unable to understand why that is a good idea. How can I bypass this?
def repartition_freq(df, freq=None):
    [...]
    try:
        start = df.divisions[0].ceil(freq)
    except ValueError:
        start = df.divisions[0]
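To illustrate what I think is happening, here is a minimal pandas-only sketch. The two timestamps are made-up values standing in for my first and last divisions, and the code only mimics the ceil step quoted above, not the rest of repartition_freq().

```python
import pandas as pd

# Hypothetical first and last divisions, aligned on 12:00:00 like my dataframe.
first = pd.Timestamp("2000-01-01 12:00:00")
last = pd.Timestamp("2000-01-05 12:00:00")

# The step from the quoted snippet: ceil the first division to the frequency.
ceiled = first.ceil("24h")

# Divisions generated from the ceiled start vs. from the original start.
rounded = pd.date_range(start=ceiled, end=last, freq="24h")
aligned = pd.date_range(start=first, end=last, freq="24h")

print("ceiled start:", ceiled)
print("from ceiled start:", list(rounded))
print("from original start:", list(aligned))
```

Once the start is ceiled, every generated division falls on midnight; starting from the original first division would keep them on 12:00:00, which is what I would like to get.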