Hi all,
I am running into an odd issue whenever my tasks call a compiled C++ executable (via subprocess). With Nanny=True (the default), each task gets pinned to a single CPU core, which is not what I want. As soon as I start the workers with Nanny=False, it works as expected and the work spreads across all cores. Ideally, I'd keep Nanny=True for its benefits, such as automatic worker restarts.
Minimal Code:
import subprocess
import textwrap

def run():
    # Run the compiled binary inside a login shell so that conda is available.
    script = textwrap.dedent(
        f"""
        bash -l <<-'HEREDOC'
        conda activate my-kernel
        /code/compiledCppCodeHere
        HEREDOC
        """
    )
    process = subprocess.Popen(
        script, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    stdout, stderr = process.communicate()
    return stdout

# client is an already-connected distributed.Client
f = [client.submit(run, pure=False) for i in range(0, 2)]
client.gather(f)
CPU profile with Nanny=True (what I don't expect)
CPU profile with Nanny=False (what I expect)
Digging a bit deeper, I see that the Nanny launches its worker process via the multiprocessing module. Could that be the cause?
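To narrow this down, something like the following could be submitted as a task under both settings and the results compared. It is only a diagnostic sketch: the name cpu_diagnostics is made up for illustration, os.sched_getaffinity is only available on some platforms (Linux among them), and the idea that thread-count environment variables such as OMP_NUM_THREADS are involved is an unconfirmed assumption on my part.

```python
import os

# Assumption: these are the variables most likely to cap threading in a
# compiled binary; the list is a guess, not something confirmed for this case.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def cpu_diagnostics():
    """Report the CPU affinity and thread-related env vars of this process."""
    if hasattr(os, "sched_getaffinity"):
        # Set of CPU ids this process is allowed to run on.
        affinity = sorted(os.sched_getaffinity(0))
    else:
        affinity = None  # not available on this platform
    env = {name: os.environ.get(name) for name in THREAD_ENV_VARS}
    return {
        "pid": os.getpid(),
        "cpu_count": os.cpu_count(),
        "affinity": affinity,
        "env": env,
    }
```

Running client.submit(cpu_diagnostics, pure=False).result() once with Nanny=True and once with Nanny=False should show whether the affinity mask or any of these variables differs between the two setups.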
So a few questions:
- How can I keep Nanny=True and still have subprocess run as expected?
- If that is not possible, is there an alternative to multiprocessing that I can configure the Nanny to use for its workers? To be honest, I'm not sure that is the problem.
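One workaround I could imagine, if the limit turns out to come from an inherited environment variable rather than from multiprocessing itself, is to pass an explicit environment to the subprocess. This is a sketch under that assumption only: run_with_threads is a hypothetical helper, and OMP_NUM_THREADS is a guess at the relevant variable, since I have not confirmed that my binary or the Nanny uses it.

```python
import os
import subprocess


def run_with_threads(cmd, n_threads):
    """Run cmd with OMP_NUM_THREADS overridden in the child's environment."""
    # Copy the worker's environment so everything else is inherited unchanged.
    env = os.environ.copy()
    # Assumption: the executable honours OMP_NUM_THREADS.
    env["OMP_NUM_THREADS"] = str(n_threads)
    result = subprocess.run(
        cmd, env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=False
    )
    return result.stdout
```

In my case the call would look like run_with_threads(["/code/compiledCppCodeHere"], os.cpu_count()) inside the task, but I would still prefer a setting on the Nanny side if one exists.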
Any pointers or background on this would be much appreciated! Thanks!