Hi, I’m facing some issues with Dask when running code that uses the `signal` module to add handlers. It seems that there is no way to have Dask with a single process per worker: even if I set `nworkers=N` and `nthreads=1`, there is always a single thread per process, and when code using `signal` runs, the module raises an error saying that signal only works in main thread. So the main question is: is it possible to have just processes (without spawning threads) in Dask workers?
Without knowing more about what you are trying to do with `signal`, it’s hard to know how to help. But one thing you might want to try is to disable the nanny.
Thanks @jacobtomlinson! The `signal` module is used by a dependency, so unfortunately I don’t have control over it. It mainly uses `signal` to attach some signal handlers; the main issue is that `signal` is incompatible with threads and Dask uses threads. What happens when the nanny is disabled? I tried to do that, but then I ran into other issues, like errors saying that daemonic processes cannot have children, etc. Do you see another alternative? Thanks again!
The Nanny creates the worker in a separate process and then keeps it alive.
I don’t think that `signal` is incompatible with threads, it’s just that handlers have to be registered from within the main thread. If you could share more about how you’re creating your Dask workers I should be able to show you how to run things in the main thread.
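To illustrate the main-thread restriction on its own, here is a minimal standard-library sketch with no Dask involved. The names `noop_handler`, `try_register` and `try_register_in_thread` are made up for this example. It assumes only the documented behaviour that `signal.signal` raises `ValueError` when called from a thread other than the main one.

```python
import signal
import threading


def noop_handler(signum, frame):
    """Hypothetical handler used only for this illustration; it does nothing."""


def try_register():
    """Try to install a SIGINT handler in the calling thread.

    Returns None on success, or the ValueError message on failure.
    """
    try:
        previous = signal.signal(signal.SIGINT, noop_handler)
    except ValueError as exc:
        return str(exc)
    # Put the previous handler back so the sketch has no lasting side effects.
    if previous is not None:
        signal.signal(signal.SIGINT, previous)
    return None


def try_register_in_thread():
    """Run try_register() in a secondary thread and return its result."""
    result = {}

    def target():
        result["error"] = try_register()

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
    return result["error"]


if __name__ == "__main__":
    # Expected to succeed when this script itself runs in the main thread.
    print("main thread:", try_register())
    # Expected to report the ValueError raised by signal.signal.
    print("secondary thread:", try_register_in_thread())
```

If a dependency calls `signal.signal` from inside a task that runs in a non-main thread, it would hit the same `ValueError`, which matches the error described above.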
Thanks @jacobtomlinson, sorry, that’s what I meant by being incompatible with threads: not that it cannot be used, just that it needs to be used appropriately. My main issue is that I cannot change this external dependency at the moment, so I will read a bit more of the code to understand how the nanny starts the processes. As far as I understand, each nanny is a separate process, and the nanny then starts a worker thread (the `nthreads=1`). Thanks again for the help.
I think disabling the Nanny might be the wrong solution here. If you can share a code example that would make things much easier to help you.
I cannot share the code unfortunately, but I will try to understand the nanny a bit better, because it is not clear what is happening under the hood in terms of how these worker processes/threads are being spawned. Thanks.
It doesn’t have to be your exact code. Just an example of how you’re launching Dask, how you’re using `signal`, and the error it produces. Ideally something we can copy/paste to see the issue for ourselves and then provide you with further guidance.