Culling old nodes from the task graph

In my case I have an iterative algorithm where each iteration uses the output from the previous one, so the Futures from the older iterations are not released until the end of the algorithm.
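
For concreteness, here is a minimal sketch of the loop pattern with dask.distributed (`step` is a hypothetical stand-in for one iteration of my algorithm); releasing the previous Future lets the scheduler forget the old task once the new one no longer depends on it:

```python
from dask.distributed import Client

client = Client()  # connect to (or start) a scheduler

def step(prev):
    # hypothetical single iteration: consumes the previous result
    return (prev or 0) + 1

result = client.submit(step, None)
for _ in range(100):
    new = client.submit(step, result)
    result.release()  # drop our reference; the scheduler keeps the key
                      # alive only while `new` still depends on it
    result = new

print(result.result())
```

Simply deleting the local reference (`del result`) would have the same effect via garbage collection.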

If you do not forget the tasks you'll likely need more memory on the scheduler, but otherwise you should be fine. The issue with large graphs often stems from the tasks being very small, such that the per-task overhead becomes a problem. You can expect something like 50-100 ms of overhead per task, so the runtime of a task should be large enough to amortize it (e.g., a task that runs for 10 s spends only about 1% of its time on overhead).

The tasks each run for a few tens of seconds, so I hope this is OK for now.

More generally, task graph manipulation also interests me for resuming from a persistent cache.
I have been trying to create a mechanism to modify the task graph so that, after a node failure, the computation in a 100-task chain is not redone from the first task but is instead fetched from the cache.
My thinking is to cut the already-completed tasks out of the chain and insert a load-from-cache task as the new source.
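
As a rough sketch of what I mean, using a raw Dask graph dict (the `load_from_cache` function and the cache path are hypothetical stand-ins for my checkpointing layer):

```python
import dask

def step(x):
    return x + 1

def load_from_cache(path):
    # stand-in for reading a persisted checkpoint from disk
    return 42  # hypothetical cached value of step-41

# Original 100-task chain: step-0 -> step-1 -> ... -> step-99
graph = {"step-0": (step, 0)}
for i in range(1, 100):
    graph[f"step-{i}"] = (step, f"step-{i-1}")

# After a failure at step 42, drop tasks 0-41 and splice in a cache load,
# so "step-41" is produced from the cache instead of being recomputed.
resumed = {k: v for k, v in graph.items() if int(k.split("-")[1]) >= 42}
resumed["step-41"] = (load_from_cache, "/cache/step-41.pkl")

result = dask.get(resumed, "step-99")
```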

One way to achieve this behavior is to periodically restart the algorithm from a cached point, but that would mean re-executing the data fetch, which can be incredibly expensive in my case. What would help is a mechanism for per-task affinity to existing nodes, so that after restarting the pipeline I can fetch the data from local memory/disk instead of downloading it again from remote sources.
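
For concreteness, a sketch of the kind of pinning I have in mind, using the `workers=` restriction that dask.distributed already supports on submit (the addresses and `fetch_data` are hypothetical):

```python
from dask.distributed import Client

client = Client("tcp://scheduler:8786")  # hypothetical address

def fetch_data(partition):
    # hypothetical: read from local disk if a previous run left the
    # data there, otherwise download from the remote source
    ...

# Pin each fetch task to the worker that held its data before the restart.
futures = [
    client.submit(fetch_data, i, workers=[f"tcp://worker-{i}:40000"])
    for i in range(4)
]
```

This only pins tasks to workers for one run, though; what I am after is affinity that survives a pipeline restart.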

Would implementing task affinity be easier to accomplish?


NOT A CONTRIBUTION