Summary: I submit a function that modifies a list to a thread-based client, but the concrete list is not modified. Why?
For process-based clients, I understand that concrete objects are serialized, sent to workers, and reconstructed on the worker side. The reconstructed objects are entirely new objects.
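To make the process-based case concrete, here is a minimal sketch that uses a pickle round trip as a stand-in for the client-to-worker transfer. This is an assumption for illustration only: the serialization path that distributed actually uses may differ.

```python
import pickle

original = ["a", "b"]

# Round trip through pickle, standing in for sending the list to a worker
# process and rebuilding it there (assumed stand-in, not distributed's own code).
reconstructed = pickle.loads(pickle.dumps(original))

# The rebuilt list is equal in value but is a different object, so mutating it
# leaves the original untouched.
reconstructed.append("c")

print(f"{reconstructed is original=}")
print(f"{original=}")
print(f"{reconstructed=}")
```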
For thread-based clients, I thought that concrete objects were shared with workers without serialization, and therefore the concrete objects and the worker objects were the same objects.
The snippet below proves me wrong for a thread-based client when using distributed locally (single machine): appending to the list submitted to the client does not append to the concrete list.
Could someone explain why or point to documentation I missed?
For comparison, appending to a list passed to another thread through a ThreadPoolExecutor does append to the concrete list (see the snippet below).
(I also thought that these shared-memory considerations only concerned process-based clients.)
from concurrent.futures import ThreadPoolExecutor

from distributed import Client


def append(_list: list, value: str) -> None:
    _list.append(value)


if __name__ == "__main__":
    my_list = []

    # Plain thread pool: the task receives the very same list object.
    with ThreadPoolExecutor() as thread_executor:
        thread_executor.submit(append, _list=my_list, value="thread_executor").result()

    # Thread-based distributed client: workers run as threads in this process.
    with Client(processes=False) as client:
        client.submit(append, _list=my_list, value="thread_client").result()

    print(f"{my_list=}")
Output:
my_list=['thread_executor']
I was expecting:
my_list=['thread_executor', 'thread_client']
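As a sanity check on the thread-pool side, here is a minimal sketch that compares object identities instead of relying on the append. The helper `object_id` is a hypothetical name introduced only for this check. It uses the standard library alone, so it confirms the sharing behaviour of ThreadPoolExecutor and says nothing about what distributed does.

```python
from concurrent.futures import ThreadPoolExecutor


def object_id(_list: list) -> int:
    """Hypothetical helper: report the identity of the list seen by the task."""
    return id(_list)


my_list = []

with ThreadPoolExecutor() as thread_executor:
    # The argument is passed by reference to the worker thread, no copy is made.
    seen_id = thread_executor.submit(object_id, my_list).result()

# my_list is still alive here, so equal ids mean it is the same object.
same_object = seen_id == id(my_list)
print(f"{same_object=}")  # prints: same_object=True
```

Submitting the same helper through `client.submit` should show whether the thread-based client hands the worker the identical object or a copy.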