Dask Dashboard limited worker occupancy

Hi,
I am working on a data processing pipeline that enriches data (pandas DataFrame processing, flair embeddings, categorisation prediction, NER prediction, etc.). I am wrapping the Python service class in dask delayed, but the performance is not up to the mark.
I can see that most of the workers are 100% utilised most of the time; however, the Occupancy section of the dashboard shows activity for only 3 workers out of 10. I am not sure what I am doing wrong.
Something similar was happening when I was using 20 workers, due to which all 10 workers are working in parallel (each worker processes user data in batches of 10 users), where each user's data is independent of the others.

Hi @hjain371,

Could you elaborate a bit on this?

When looking at the Dashboard, for example at the task stream, are you seeing gaps between rectangles? Do you feel that some workers are idle?

I’m not sure I understand what you mean. Only 10 workers out of 20 were actually doing work?

Did you have a look at your task graph, or at how many chunks of data you were generating at first?
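
To make the question about chunks concrete, here is a minimal sketch of how you could count the tasks you hand to Dask. Everything in it is an assumption for illustration: `make_batches` and `enrich_batch` are hypothetical stand-ins for your own batching and service code, and the 30 users are made-up numbers. If the number of delayed tasks is smaller than the number of workers, some workers will necessarily have nothing queued.

```python
import dask
from dask import delayed


def make_batches(user_ids, batch_size):
    """Split a list of user ids into consecutive batches."""
    return [user_ids[i:i + batch_size] for i in range(0, len(user_ids), batch_size)]


def enrich_batch(batch):
    # Hypothetical placeholder for the real enrichment service
    # (embeddings, NER, etc.); here it only tags each user id.
    return [f"enriched-{user}" for user in batch]


user_ids = list(range(30))   # assumed example input, not real data
n_workers = 10               # cluster size mentioned in the question

batches = make_batches(user_ids, batch_size=10)
tasks = [delayed(enrich_batch)(batch) for batch in batches]

# One delayed call per batch: this is the upper bound on how many
# workers can be busy at the same time.
print(f"{len(tasks)} tasks for {n_workers} workers")

# The synchronous scheduler is used only so the sketch runs without a cluster.
results = dask.compute(*tasks, scheduler="synchronous")
```

On a real cluster you would submit the same `tasks` through your distributed client instead, and `dask.visualize(*tasks)` can render the graph if graphviz is installed. Whether this is what happens in your pipeline is something only your actual batch count can tell.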