How to find count of idle workers from scheduler_info?

Unfortunately, when I switched to SpecCluster, my CPU utilization dropped to ~40-45%.
I created a worker_spec dict with 3 outer and 12 inner tasks and fed it 3 files. I tried leaving the scheduler unspecified, specifying a default one, and feeding fewer futures to the inner level, with no change…

It seems to be using threads (is there no setting to force processes?), and/or it is context switching heavily, or it simply has too much scheduling overhead.
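As far as I understand it, the `cls` entry in a SpecCluster worker spec decides this: `Worker` runs inside the calling process, while `Nanny` starts each worker in its own process. Below is a minimal sketch of a process-based spec. The helper names `build_worker_spec` and `start_process_cluster`, the worker names, and the counts are hypothetical placeholders, not my actual setup.

```python
def build_worker_spec(n_workers, worker_cls, nthreads=1):
    """Build a SpecCluster-style worker spec: name -> {"cls", "options"}."""
    return {
        f"worker-{i}": {"cls": worker_cls, "options": {"nthreads": nthreads}}
        for i in range(n_workers)
    }


def start_process_cluster(n_workers=3):
    """Start a SpecCluster whose workers each live in their own process."""
    # Imported lazily so build_worker_spec stays usable without Dask installed
    from dask.distributed import Nanny, Scheduler, SpecCluster

    # Nanny (rather than Worker) launches every worker as a separate process
    workers = build_worker_spec(n_workers, Nanny, nthreads=1)
    scheduler = {"cls": Scheduler, "options": {"dashboard_address": ":0"}}
    return SpecCluster(scheduler=scheduler, workers=workers)
```

On Windows, `start_process_cluster()` has to be called under `if __name__ == "__main__":`, because the worker processes are spawned by re-importing the main module.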

One interesting thing: there is a large delay after the audio processing has finished, which was not there with LocalCluster:

Memory usage in each worker also increased considerably and throttled the workers, so I had to set the memory limit to 0 (although Task Manager shows 32GB of free RAM). This is probably because I have to read the chunk at the outer level and pass it on AFTER some pre-processing. When I give 3*12 processes to the inner level, memory usage triples.

Any advice will be very much appreciated.
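
For the question in the title, here is a minimal sketch of one way to count idle workers. It assumes that each worker entry in `client.scheduler_info()["workers"]` carries a `metrics` dict with either an `executing` count or a `task_counts` mapping. I have not verified that layout across distributed versions, and `count_idle_workers` is a hypothetical helper name.

```python
def count_idle_workers(info):
    """Count workers that report no executing tasks.

    `info` is expected to look like the dict returned by
    client.scheduler_info(). The metric key names are an assumption
    and may differ between distributed versions.
    """
    idle = 0
    for worker in info.get("workers", {}).values():
        metrics = worker.get("metrics", {})
        if "executing" in metrics:
            executing = metrics["executing"]
        else:
            # Fallback layout: per-state task counts nested under "task_counts"
            executing = metrics.get("task_counts", {}).get("executing", 0)
        # A worker that reports no executing count at all is treated as idle
        if executing == 0:
            idle += 1
    return idle


# Hand-written sample in the assumed shape, not real scheduler output
sample = {
    "workers": {
        "tcp://127.0.0.1:1001": {"metrics": {"executing": 2}},
        "tcp://127.0.0.1:1002": {"metrics": {"executing": 0}},
        "tcp://127.0.0.1:1003": {"metrics": {"task_counts": {"memory": 5}}},
    }
}
print(count_idle_workers(sample))
```

With a live client this would be called as `count_idle_workers(client.scheduler_info())`. The metrics are refreshed on worker heartbeats, so the count can lag slightly behind the real state.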