Hello there,
I am a maintainer of Apache Airflow, and we have (again) reached a point where we are discussing whether to remove DaskExecutor support from Apache Airflow.
I just started a discussion here: https://lists.apache.org/thread/6stgcpjt5jb3xfw92oo1j486j33c8v7m. The problem is that the Dask Executor implementation in Airflow uses old libraries (including an old distributed library) that are holding us back as dependencies (for example, we cannot upgrade to ARM M1 compatible dependencies or to Python 3.10). When we try to upgrade the dependencies, our tests fail:
- partially because (apparently) the newer distributed library does not contain the test folder, which breaks some of our tests
- partially because there are some errors that we aren’t really able to diagnose, as no one in our team uses Dask
I would like to know if:
- anyone here uses Dask with Airflow and expects it to keep working in the future
- someone from the Dask team could help with fixing the failing tests
For now, the course of action I am proposing is to disable all Dask tests in the development version and remove all the limitations that Dask introduces. We are preparing the 2.3.0 release of Airflow, in which DaskExecutor would become an “unsupported” or “untested” feature of Airflow (and we would tell our users which limitations they need to apply). However, this will prevent those users from using some other Airflow features alongside it: for example, they won’t be able to use it with Python 3.10 or together with the Google integration, because we are adding some newer features that conflict with the current Dask support.
I would love to hear your thoughts about it.