Dask team releases version 2022.03.0

This release includes contributions from 28 people, including 7 new contributors! You can see the complete list of changes in the Changelog. Highlights include:

  1. Dask Bag now supports out-of-memory sampling using reservoir sampling, thanks to Daniel Mesejo-León

  2. There have been a number of improvements to documentation. There is a new page documenting the interactive Dask dashboard, thanks to @ncclementi

  3. Other doc improvements include better API docs of string, categorical, and datetime accessors for DataFrames, and updated explanations of how Dask’s distributed scheduler assigns tasks to workers. Thanks to @scharlottej13, @jcristharif, keewis, and @gjoseph92!

  4. The Dask distributed scheduler can now dump its cluster state to arbitrary fsspec-compatible filesystems (like S3 or GCS), allowing for easier debugging.

  5. In Dask DataFrame, you can now pass a Dask Index to set_index(), thanks to @phobson

  6. Finally, Dask has deprecated reading bcolz tables, as the Blosc Development Team is no longer able to maintain the bcolz package. Thanks to @pavithraes for adding the deprecation warning.
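
For anyone unfamiliar with the reservoir sampling mentioned in the first highlight, here is a minimal pure-Python sketch of the general idea (the classic "Algorithm R"). It is not Dask's implementation: the function name `reservoir_sample` and its parameters are hypothetical and chosen only for illustration. The point is that a uniform sample of `k` items can be drawn in a single pass while holding only `k` items in memory, which is what makes sampling from larger-than-memory data feasible. In Dask itself, the relevant entry point is, to the best of my knowledge, `dask.bag.random.sample`.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Draw up to k items uniformly at random from a stream in one pass.

    Hypothetical helper for illustration only; this is not Dask's code.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an existing one
            # only if the slot falls inside the reservoir (probability k / (i + 1)).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# The generator is consumed lazily, so the full stream is never held in memory.
sample = reservoir_sample((x * x for x in range(100_000)), k=5, seed=0)
print(sample)
```

If the stream has fewer than `k` items, this sketch simply returns all of them in their original order.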

Thanks to @jsignell for managing this release, and to all Dask contributors who had a hand in it!
