Dask data sharding

scharlottej13 · January 13, 2022, 9:26pm

Hi @vigneshn1997, this seems like a follow-up to your previous post, thanks for bringing it up again! Dask has a number of ways to read and process data in parallel. Can you provide more information on what kind of data you’re using and where it is stored? In the meantime, you may find that this example is similar to your workflow, and there’s also this reference for connecting to remote data.
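To make "processing data in parallel" a bit more concrete, here is a minimal sketch. It assumes the data is a small in-memory NumPy array, which is only a stand-in since the actual data type and storage location are not known yet; the array contents and the chunk size of 250 are made up for illustration.

```python
import numpy as np
import dask.array as da

# Stand-in data (an assumption): 1,000 integers held in memory.
data = np.arange(1_000)

# Split the array into chunks of 250 elements. Each chunk becomes a
# separate task that the scheduler can run on a different thread or worker.
x = da.from_array(data, chunks=250)

# Number of chunks along each axis.
num_blocks = x.numblocks

# Building the sum is lazy; compute() runs the per-chunk sums and
# combines the partial results.
total = x.sum().compute()

print(num_blocks, total)
```

For data that lives in files or remote storage, the chunking would normally come from how the data is read rather than from `from_array`, so the right approach depends on the answers to the questions above.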

