Hi @hjlee9182, welcome to the Dask Discourse forum!
As indicated here for the schema kwarg:
Global schema to use for the output dataset. Defaults to “infer”, which will infer the schema from the dask dataframe metadata. This is usually sufficient for common schemas, but notably will fail for object dtype columns that contain things other than strings. These columns will require an explicit schema be specified.
So you need to specify a schema in to_parquet. I’m no pyarrow expert, but I’ve been able to make it work with:
import pyarrow as pa
df.to_parquet('/tmp/arrayparquet', engine='pyarrow', schema={"float_array_column": pa.list_(pa.float64())})