I have a bag of tasks defined as

    (dask.bag
        .from_sequence(records)
        .map(process_record))

and I'm running it on a FargateCluster. The cluster spins up fine, but my tasks are getting killed, I believe because of memory pressure: a single `process_record` call can take up to 5 GB of RAM, so if a worker runs more than 3 tasks at once it has no RAM left.
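For reference, here is a minimal sketch of that setup. The `dask_cloudprovider` import path and the `FargateCluster` sizing parameters are my best guess at reproducing it; the worker sizes are placeholders, and `records` / `process_record` stand in for my real data and function:

```python
import dask.bag as db
from dask.distributed import Client
from dask_cloudprovider.aws import FargateCluster  # pip install "dask-cloudprovider[aws]"

# Placeholder stand-ins for the real input data and the ~5 GB-per-call function.
records = [...]
def process_record(record): ...

# Sizing numbers are placeholders, not a recommendation.
cluster = FargateCluster(
    n_workers=4,
    worker_cpu=4096,    # 4 vCPU per Fargate task
    worker_mem=16384,   # 16 GB -- only ~3 concurrent process_record calls fit
)
client = Client(cluster)

bag = db.from_sequence(records).map(process_record)
results = bag.compute()
```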
So, how do I limit the number of concurrent tasks sent to a single machine? Ideally I would like to run a single task per machine.
Here is the error message I'm getting:

    distributed.scheduler.KilledWorker: ("('from_sequence-e3db06f3f9ad9e24eafbfa935bf4a7c9', 2)", <WorkerState 'tcp://172.31.40.119:41411', name: 0, status: closed, memory: 0, processing: 196>)
    distributed.deploy.adaptive_core - INFO - Adaptive stop
    Task exception was never retrieved
    future: <Task finished name='Task-534' coro=<_wrap_awaitable() done, defined at /usr/lib/python3.9/asyncio/tasks.py:685> exception=HTTPClientError('An HTTP Client raised an unhandled exception: cannot schedule new futures after shutdown')>