Dask execution performed only the first time

I have a Jupyter notebook where I want to show that I am using Dask to compute a demanding function.
However, it seems that my data is being cached: after the first time I run the cell, subsequent runs show no activity in the dashboard.

Here is my code:

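import dask
import xarray as xr
import rioxarray  # registers the .rio accessor used below
from dask.distributed import Client
....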

# Define a computationally expensive function used in the Plot API (GDAL based)
def extract_data_at_polygon(ncfile, polygon):
    # set CRS to cut over shapefile
    ds = ncfile.rio.write_crs("EPSG:4326")
    # create the geometry
    coords = polygon.split("((")[-1].split("))")[0].split(",")
    coords_array = []
    for cc in coords:
        point = cc.split(" ")
        point_to_float = [float(i) for i in point]
        coords_array.append(point_to_float)
    # Extract the envelope coordinates
    envelope_coords = [
        [min(p[0] for p in coords_array), min(p[1] for p in coords_array)],
        [max(p[0] for p in coords_array), min(p[1] for p in coords_array)],
        [max(p[0] for p in coords_array), max(p[1] for p in coords_array)],
        [min(p[0] for p in coords_array), max(p[1] for p in coords_array)],
    ]
    envelope = {
        'type': 'Polygon',
        'coordinates': [envelope_coords]
    }
    # Clip using the envelope
    data = ds.rio.clip([envelope], ds.rio.crs, all_touched=True, from_disk=False)
    return data

client = Client(dashboard_address=8088)
client.dashboard_link

ds = xr.open_dataset(era5_file, engine='netcdf4', chunks={'lat': 'auto', 'lon': 'auto', 'time': 'auto', 'level': 1})
era5_polygon = extract_data_at_polygon(ds, polygon)

Even though I never call compute(), I still get a result every time.

To be able to see the execution in the dashboard, I have to add era5_polygon.compute(), but then the execution takes 29 seconds instead of the 0.04 seconds it takes without it.

Any explanation? Am I doing something wrong?

Thanks in advance

Hi @TheRed86, welcome to the Dask Discourse forum,

I’m not sure I understand it all: which part of the code are you executing several times?

What are the results you get without calling compute()?

I’m wondering whether this is just the lazy nature of Dask you are observing, or whether it has to do with pure functions not being recomputed once their result already exists.
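
To illustrate what I mean by laziness, here is a minimal sketch on a toy function, not on your ERA5 data. The names expensive and calls are hypothetical and only there for illustration, and I use the synchronous scheduler so the example does not need a running cluster. Creating the lazy object does no work at all, which would be consistent with your 0.04 seconds, and the function only runs when compute() is called.

```python
import dask

calls = []  # records every real execution of the function


@dask.delayed
def expensive(x):
    # Hypothetical stand-in for a costly computation.
    calls.append(x)
    return x * 2


# Building the delayed object only creates a task graph; nothing runs yet.
lazy = expensive(21)
calls_before = len(calls)

# compute() actually executes the graph.
result = lazy.compute(scheduler="synchronous")
calls_after_first = len(calls)

# A second compute() executes the graph again; the earlier result
# is not kept around unless you hold on to it yourself.
lazy.compute(scheduler="synchronous")
calls_after_second = len(calls)

print(calls_before, result, calls_after_first, calls_after_second)
```

If your case behaves the same way, the cell without compute() would only be building the graph on a chunked xarray dataset, so the dashboard has nothing to show until you ask for the values.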