How to get associated netcdf names from dask object

Thanks for providing more details. I see what you mean about wanting easy access to the filename from the delayed object. I think it is hard to get at from the high-level to_netcdf function because the task name is tokenized, which is what ensures the keys are unique.
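To illustrate what "tokenized" means here, a minimal sketch using a hypothetical delayed function, fake_write, as a stand-in (this is not xarray's actual writer, and the keys xarray generates may look different). For a plain dask.delayed call, the key is the function name plus a token, so the filename does not appear in it unless you set the key yourself with dask_key_name:

```python
import dask


@dask.delayed
def fake_write(filename):
    # hypothetical stand-in for a write task; it does nothing
    return None


# default key: "<function name>-<token>", with no trace of the filename
task = fake_write("example.nc")
print(task.key)

# for delayed functions you control, dask_key_name sets the key explicitly
named = fake_write("example.nc", dask_key_name="write-example.nc")
print(named.key)
```

This only helps for delayed calls you create yourself; to_netcdf builds its own task names.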

With graph_manipulation.bind, you can create an explicit dependency between tasks. Here is an example from another Discourse question. In your case, you could do something like:

import xarray as xr
import numpy as np
import pandas as pd
from dask.graph_manipulation import bind
import dask


@dask.delayed
def special_func():
    # placeholder for whatever should run once the files are written
    pass


# create fake xarray dataset
data = np.random.rand(4, 3)
locs = ["IA", "IL", "IN"]
times = pd.date_range("2000-01-01", periods=4)
foo = xr.DataArray(data, coords=[times, locs], dims=["time", "space"])

# create dependency for some list of files
list_of_files = ['example.nc', 'example1.nc']
delayeds = [
    foo.to_netcdf(filename, compute=False) for filename in list_of_files
]
# special_func will only run after every write in delayeds has finished
new_func = bind(special_func(), delayeds)
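If the goal is to know which file each delayed write belongs to, one option is to keep that mapping yourself instead of reading it back from the task keys. A minimal sketch, assuming a hypothetical fake_write function in place of foo.to_netcdf(filename, compute=False) so that it runs without a netCDF backend; the log list and the writes_by_name dict are illustrative names as well:

```python
import dask
from dask.graph_manipulation import bind

log = []  # shared record of what ran; relies on an in-process scheduler


@dask.delayed
def fake_write(filename):
    # hypothetical stand-in for foo.to_netcdf(filename, compute=False)
    log.append(filename)


@dask.delayed
def special_func():
    log.append("special_func")


list_of_files = ["example.nc", "example1.nc"]

# keep the filename next to each delayed write instead of parsing task keys
writes_by_name = {filename: fake_write(filename) for filename in list_of_files}

# make special_func depend on every write, as in the example above
after_writes = bind(special_func(), list(writes_by_name.values()))
after_writes.compute(scheduler="threads")

print(log)
```

With the real to_netcdf call, the dict comprehension would wrap foo.to_netcdf(filename, compute=False), and writes_by_name["example.nc"] gives you the delayed object for that file.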