Hi @guillaumeeb,
Thanks for the info again. I didn't specifically create any result group. I read the Parquet files with a fixed block size, e.g. 32 MiB or 64 MiB, and the Parquet files themselves also have a fixed row group size of 64 MiB. Since I use dd.read_parquet to load them into a Dask DataFrame, everything is in Dask collections.
Most of my operations on the Dask collections are read_parquet/to_parquet, map_partitions, apply, and delayed, roughly like the sketch below.
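Just to make it concrete, the workflow looks roughly like this. The paths, column names, and the transform body are only placeholders (not my real code), and I'm assuming a Dask version where read_parquet accepts a blocksize argument:

```python
import dask.dataframe as dd

# Read Parquet with a fixed block size (32 or 64 MiB); the files themselves
# have ~64 MiB row groups. Path and blocksize handling are placeholders.
df = dd.read_parquet("data/*.parquet", blocksize="64MiB")

def transform(part):
    # Placeholder per-partition work on a pandas DataFrame
    part["value"] = part["value"] * 2
    return part

# Typical per-partition operations: map_partitions / apply
result = df.map_partitions(transform)

# Write the result back out
result.to_parquet("output/", write_index=False)
```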
That is one reason I wonder why the unmanaged memory is so high.
One thing I read on some pages is that Python objects (structured data types or strings) count as part of unmanaged memory. However, it bothers me because it makes the spill mechanism almost not work.
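For reference, this is the spilling I mean. The fractions below are the worker memory defaults as I understand them; since the thresholds are checked against the whole process memory but only managed data can be spilled, a lot of unmanaged memory pushes the worker toward pause/terminate without much left to spill:

```python
import dask

# Worker memory thresholds (fractions of the memory limit); these should be
# the documented defaults. Spilling can only move *managed* data to disk, so
# unmanaged Python-object memory still counts toward pause/terminate.
dask.config.set({
    "distributed.worker.memory.target": 0.60,     # start spilling managed data
    "distributed.worker.memory.spill": 0.70,      # spill based on process memory
    "distributed.worker.memory.pause": 0.80,      # pause accepting new tasks
    "distributed.worker.memory.terminate": 0.95,  # kill the worker
})
```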
But sometimes a worker looks like it has the right amount, see the third worker from the bottom.
My speculation is that the structured data makes the partition sizes unbalanced, and I am still investigating, with something like the check sketched below.
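The idea is to measure the in-memory size of each partition with deep=True so the Python object/string overhead is counted; this should show both how much the object columns inflate memory and whether the partitions are unbalanced. The path and meta details are just placeholders:

```python
import dask.dataframe as dd

# Same DataFrame as above; path is a placeholder
df = dd.read_parquet("data/*.parquet")

# Per-partition in-memory size in bytes, counting Python object overhead
sizes = df.map_partitions(
    lambda part: part.memory_usage(deep=True).sum(),
    meta=("size", "int64"),
).compute()

print(sizes.describe())  # spread of partition sizes (min/max/mean)
```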
