Dask team releases versions 2022.05.1 and 2022.05.2

The Dask team released versions 2022.05.1 and 2022.05.2 this week! These include contributions from 33 people, including 7 new contributors. :tada:

Check out the complete changelog here. Some release highlights:

  • We’re glad to introduce two new ways to create Dask DataFrames. Thanks, @rjzamora and Matthew Powers, for adding these!
    • from_map() to create DataFrames by mapping a custom function over an iterable of inputs, and
    • from_dict() to create DataFrames from Python dictionaries.
  • The team continued to improve how Dask DataFrame works with Parquet files! Thanks, @rjzamora, @jcristharif, and @ian for leading this work! Note that this effort also involves some breaking API changes, which you can track in the changelog.
  • A frequently referenced documentation page, “Creating and Storing DataFrames”, has been updated to reflect current best practices! Thanks, @scharlottej13! Check it out here: Create and Store Dask DataFrames — Dask documentation
  • Thanks to @ddavis, Dask has a new DaskCollection Protocol that defines the interface for a Dask Collection! It can be used with static type checkers and for self-documenting parts of the Dask docs, among other things.
  • Dask Distributed has a new Scheduler HTTP API that exposes some scheduler methods for nicer interactions! Thank you for working on this, Matthew Murray!

Finally, thanks @jrbourbeau for managing the releases. :sparkles:
