Hi All,
I have a computation that involves several large arrays, all of the same length (5175148 elements). The Best Practices doc says that …
Because of that, I wrote my code as follows.
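import numpy as np
import h3
from dask import delayed, compute

def search_neighbors(hexagons_all, h3str):
    # Ring-1 neighbors of this cell
    L1_neighbors = list(h3.hex_ring(h3str, 1))
    # Local IJ coordinates of the neighbors, relative to the cell itself
    L0_ij = h3.experimental_h3_to_local_ij(origin=h3str, h=h3str)
    L1_ij = np.array([h3.experimental_h3_to_local_ij(origin=h3str, h=L1)
                      for L1 in L1_neighbors]) - L0_ij
    L1_neighbors = [h3.string_to_h3(a) for a in L1_neighbors]
    # Order the neighbors by i, then by j
    idx = np.lexsort((L1_ij[:,1], L1_ij[:,0]))
    # Indices of the neighbors in hexagons_all (assumed to be sorted)
    return np.searchsorted(hexagons_all, L1_neighbors)[idx]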
r, size = 40, len(h3strs_all0)
with MeasureTime(' search_neighbors & slope & aspect'):
    #----- test 01
    neighbors = []
    for i in range(r):
        begin = int(i*size/r)
        end = int((i+1)*size/r)
        neighbors.append([
            delayed(search_neighbors)(hexagons_all, delayed(h3str))
            for h3str in h3strs_all[begin:end]
        ])
    neighbors = compute(*neighbors)
    slope, aspect = compute_slope_aspect(zs_all, h3strs, neighbors.T, search_radius)
Note that h3strs_all and hexagons_all are both delayed objects.
However, when I run this, it fails with an error saying that Delayed objects of unspecified length are not iterable. If I set h3strs_all0 = h3strs_all.compute() and use h3strs_all0 instead of h3strs_all, the code runs, but it is not any faster. CPU usage reaches 100%, and it is much slower than the plain loop without the function and without Dask, which is:
with MeasureTime(' search_neighbors & slope & aspect, no dask'):
    neighbors = []
    for h3str in h3strs_all:
        L1_neighbors = list(h3.hex_ring(h3str, 1))
        L0_ij = h3.experimental_h3_to_local_ij(origin=h3str, h=h3str)
        L1_ij = np.array([h3.experimental_h3_to_local_ij(origin=h3str, h=L1)
                          for L1 in L1_neighbors]) - L0_ij
        L1_neighbors = [h3.string_to_h3(a) for a in L1_neighbors]
        idx = np.lexsort((L1_ij[:,1], L1_ij[:,0]))
        neighbors.append(np.searchsorted(hexagons_all0, L1_neighbors)[idx])
    neighbors = np.array(neighbors)
    slope, aspect = compute_slope_aspect(zs_all, h3strs, neighbors.T, search_radius)
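For context, here is a minimal sketch of the batching pattern the Dask snippet is aiming for: one delayed task per slice of cells, instead of one delayed task per cell. It runs on toy integers only. The names neighbors_of, neighbors_of_batch and batched_neighbors are hypothetical stand-ins, not part of h3 or Dask, and the input is assumed to be a concrete list that has already been computed, since its length is needed to cut the slices.

```python
import dask
from dask import delayed


def neighbors_of(cell):
    # Hypothetical stand-in for the per-cell h3 neighbor lookup.
    return [cell - 1, cell + 1]


def neighbors_of_batch(cells):
    # One task processes a whole slice, so the per-task overhead is
    # paid once per batch rather than once per cell.
    return [neighbors_of(c) for c in cells]


def batched_neighbors(cells, n_batches):
    # `cells` is assumed to be a concrete list (known length).
    size = len(cells)
    # Integer slice boundaries: 0 = bounds[0] <= ... <= bounds[-1] = size
    bounds = [i * size // n_batches for i in range(n_batches + 1)]
    tasks = [
        delayed(neighbors_of_batch)(cells[lo:hi])
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]
    parts = dask.compute(*tasks)  # tuple with one list per batch
    # Flatten back to one row per cell, in the original order.
    return [row for part in parts for row in part]


cells = list(range(10))
result = batched_neighbors(cells, n_batches=3)
```

With r = 40 this pattern would build 40 tasks rather than one per cell. Whether it is actually faster than the plain loop probably also depends on which scheduler is used, so treat it as a starting point only.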
Could someone help me to optimize this code, please?
Thanks
cyhsu