Hi, I’m wondering what some best practices are when using dask-sql queries?
Does having two inner joins significantly slow down the process?
@farmd Welcome! There are some general guidelines in these docs:
Does having two inner joins significantly slow down the process?
It depends. Joins are quite expensive in a parallel, distributed setting, since rows with matching keys often have to be moved between workers before they can be combined. However, we may be able to optimize the query based on your computation and data. Would you be able to share a minimal, reproducible example? I’ll be happy to look into it further.
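To give a rough feel for where the cost comes from, here is a toy, pure-Python model of a hash-partitioned ("shuffle") inner join. It is only a sketch under simplifying assumptions, not how Dask or dask-sql implement joins internally. The function names (`hash_partition`, `shuffle_inner_join`) and the `orders`/`customers`/`products` tables are made up for illustration.

```python
from collections import defaultdict


def hash_partition(rows, key, n_partitions):
    """Route each row to a partition chosen by hashing its join key (the "shuffle")."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[hash(row[key]) % n_partitions].append(row)
    return partitions


def shuffle_inner_join(left, right, key, n_partitions=4):
    """Inner-join two lists of dicts; also return how many rows were rerouted."""
    left_parts = hash_partition(left, key, n_partitions)
    right_parts = hash_partition(right, key, n_partitions)
    joined = []
    for lpart, rpart in zip(left_parts, right_parts):
        # Within one partition: build a lookup table on one side, probe with the other.
        index = defaultdict(list)
        for r in rpart:
            index[r[key]].append(r)
        for l in lpart:
            for r in index.get(l[key], []):
                joined.append({**l, **r})
    # In this toy model every input row is rerouted once before matching.
    return joined, len(left) + len(right)


# Hypothetical tables, standing in for the two inner joins in the question.
orders = [
    {"order_id": 1, "customer_id": 10, "product_id": 100},
    {"order_id": 2, "customer_id": 11, "product_id": 101},
    {"order_id": 3, "customer_id": 12, "product_id": 999},  # matches nothing
]
customers = [{"customer_id": 10, "name": "a"}, {"customer_id": 11, "name": "b"}]
products = [{"product_id": 100, "title": "x"}, {"product_id": 101, "title": "y"}]

step1, moved1 = shuffle_inner_join(orders, customers, "customer_id")
step2, moved2 = shuffle_inner_join(step1, products, "product_id")
total_moved = moved1 + moved2
```

In this model the second join has to repartition the intermediate result of the first one on a different key, so the data movement is paid again. How much that matters in practice depends on table sizes and key distribution, which is why a reproducible example helps.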