pandas.read_csv(index_col=False) with dask ? index problem

maximemerat · November 16, 2022, 2:47pm

Hi everyone,

I am trying to load a CSV file with dask, but I have a problem with the index.

With pandas.read_csv(), I can pass the parameter 'index_col=False' to fix the problem.
Example below
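
A minimal sketch of the pandas behaviour described above. It assumes the file has a trailing delimiter on every data row, which is the usual case where 'index_col=False' is needed. The sample data and the names csv_text, default_df and fixed_df are made up for illustration and are not taken from the original file.

```python
import io

import pandas as pd

# Hypothetical sample: the header has 3 names, but each data row ends with
# an extra delimiter, so pandas sees 4 fields per row.
csv_text = "a,b,c\n1,2,3,\n4,5,6,\n"

# Default behaviour: the first field of each row is taken as the index,
# so the values are shifted one column to the left of their names.
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col=False tells pandas not to use the first column as the index;
# the values line up with the header again.
fixed_df = pd.read_csv(io.StringIO(csv_text), index_col=False)

print(default_df)
print(fixed_df)
```

In the first frame the values 1 and 4 end up in the index; in the second they are back under column a. Depending on the pandas version, the second call may emit a ParserWarning about the header length.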

But if I try to load the same file with dask, I get this error: "ValueError: Keywords 'index' and 'index_col' not supported. Use dd.read_csv(…).set_index('my-index') instead"

When I remove the 'index_col=False' parameter from my dask.dataframe.read_csv() call, my dataframe looks like this:
Example below

As we can see, the first column of my CSV file becomes the index of my dataframe.
And if I try reset_index(), my column names don't match my column values:
Example below

Do you know how to fix this problem with dask?

Ideally, I am looking for a method that solves the problem as soon as the file is opened, without having to rename the columns afterwards.

Thank you all for your answers
