Summary
Dask release 2023.2.1 introduced a new shuffling method called P2P for dask.dataframe, making sorts, merges, and joins faster while using constant memory. This article describes the problem, the new solution, and the impact on performance.