Hi, I am trying to benchmark the performance of an HPC cluster using Dask's SLURMCluster. Thanks to the help I previously received from you, I managed to make it scale well on my cluster.
There is one issue that remains unsolved. I noticed that if I initialize a scheduler and ask it to perform the same task multiple times, the first run is several seconds slower than the rest.
I am attaching the results from the 3 runs on the same cluster:
Run 1 → Dask Performance Report
Run 2 → Dask Performance Report
Run 3 → Dask Performance Report
I read in the report that the first run spent several seconds in "deserialize time", and I wonder where this extra time comes from. Looking at the Task Stream, I can confirm that a "deserialize-dask_mapper" block appears only in the first run.
What I tried was to import ROOT as an external module, by uploading it via relative paths:
client.upload_file('../../root/root_rdfenv-ucx-2/lib/DistRDF/Backends/Dask/Backend.py')
client.upload_file('../../root/root_rdfenv-ucx-2/lib/ROOT/__init__.py')
I again produced 3 reports, one for each run:
Run 1 → Dask Performance Report
Run 2 → Dask Performance Report
Run 3 → Dask Performance Report
The reports for runs 2 and 3 look as before. But in report 1 there is no longer any time labeled "deserialize time", nor a "deserialize-dask_mapper" in the task stream. Yet the first run still took a few seconds longer than the others.
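While waiting for an answer, one common way to keep such benchmarks comparable is to report the first (warm-up) run separately from the steady-state runs, so one-off costs like deserialization or imports don't skew the average. A generic sketch, not tied to Dask:

```python
import statistics
import time

def benchmark(task, n_runs=3):
    """Time `task` n_runs times; report the warm-up run separately."""
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return {
        "warmup": timings[0],                          # first run: one-off costs
        "steady_state": statistics.mean(timings[1:]),  # runs 2..n
    }

result = benchmark(lambda: sum(range(100_000)), n_runs=3)
print(result)
```

Here `task` would be whatever submits the distributed job and waits for its result; the assumption is simply that the one-off costs hit only the first invocation.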
Question 1: What might be the cause of the slower first runs?
Question 2: What is the transfer time in the summary of the report? I see that it is minimal for the first run, which seems counterintuitive to me.