Method 'acquire' of '_thread.lock' taking 90% of time

I am not using the distributed client. First I load a pandas DataFrame into a Dask DataFrame, then run a regex check on every column. I have multiple regexes, so I repeat this operation for each one and store the resulting futures in a dict so that I can compute everything at the end.
Here is the code:

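from typing import Dict

futures_dict: Dict[str, Dict[str, float]] = {}
for matcher in regex_list:
    # Lazy per-column match rate for this regex; nothing is computed yet.
    column_name_and_mean: Dict[str, float] = {}
    for column_name in table_df.columns:
        column_name_and_mean[column_name] = table_df[column_name].str.match(matcher.regex).mean()
    # The dict key was missing in the original post; matcher.regex is an assumption.
    futures_dict[matcher.regex] = column_name_and_mean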


This is the flame graph from cProfile.

So, most of my time is spent in ‘acquire’ of ‘_thread.lock’. Can someone help me with this problem?
Thank you!

Hi @rvarunrathod, welcome to the Dask community!

It would really help if you could provide a complete reproducer of your workflow. Could you do that?

How does the performance compare if you just loop and do the computation with plain pandas? Is it faster?
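A pandas-only baseline for that comparison could look roughly like the sketch below. The sample data and the two patterns are made up for illustration, since the original table and regex list were not posted; only the `str.match(...).mean()` step is taken from the question.

```python
import time
from typing import Dict

import pandas as pd

# Hypothetical sample data and patterns (not from the original post).
table_df = pd.DataFrame(
    {
        "code": ["A1", "B2", "xx", "C3"],
        "word": ["apple", "kiwi", "fig", "plum"],
    }
)
patterns = {"letter_digit": r"[A-Z]\d", "four_plus_letters": r"[a-z]{4,}"}

start = time.perf_counter()
results: Dict[str, Dict[str, float]] = {}
for name, pattern in patterns.items():
    # Same per-column check as in the question, but evaluated eagerly by pandas.
    results[name] = {
        column: float(table_df[column].str.match(pattern).mean())
        for column in table_df.columns
    }
elapsed = time.perf_counter() - start

print(results)
print(f"pandas only: {elapsed:.4f} s")
```

Timing the same loop on the real table gives a reference point for judging whether Dask helps at this data size.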

What happens if you try to use a distributed setup, with a LocalCluster?

I’m not entirely sure what the profile you are getting means; maybe you are not getting information from all the threads?

This is entirely expected: snakeviz/cProfile only looks at what your main thread is doing, but all of the compute work is happening in other threads, so the profile just shows the main thread waiting (the ‘acquire’ of ‘_thread.lock’ calls). This is one of the reasons to use distributed even just locally (LocalCluster), because it gives much better diagnostic information.
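The effect can be reproduced without Dask at all. The sketch below uses only the standard library; `worker_task` and the 0.3 second sleep are made up for illustration. The profiler is started on the main thread, the work runs on a pool thread, and the report is dominated by the main thread's lock wait.

```python
import cProfile
import io
import pstats
import time
from concurrent.futures import ThreadPoolExecutor


def worker_task() -> int:
    """Stand-in for compute work that runs on a worker thread."""
    time.sleep(0.3)
    return sum(range(1000))


profiler = cProfile.Profile()
profiler.enable()  # started from the main thread
with ThreadPoolExecutor(max_workers=1) as pool:
    # The main thread blocks here until the worker thread finishes.
    result = pool.submit(worker_task).result()
profiler.disable()

# Collect the five entries with the largest own time into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

The top entry should be a lock `acquire` call, which matches the flame graph in the question: the time is real, but it is waiting time rather than work.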

There are some tools available for diagnostics without distributed (the dask.diagnostics module), but you need to opt in to using them; see the docs.