FargateCluster timeout on exit
When I close a Fargate cluster (similar to #220) using
client.close()
cluster.close()
I receive the following error:
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/deploy/adaptive_core.py", line 190, in adapt
    target = await self.safe_target()
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/deploy/adaptive_core.py", line 128, in safe_target
    n = await self.target()
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/deploy/adaptive.py", line 146, in target
    return await self.scheduler.adaptive_target(
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/core.py", line 789, in send_recv_from_rpc
    comm = await self.live_comm()
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/core.py", line 747, in live_comm
    comm = await connect(
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/comm/core.py", line 307, in connect
    raise IOError(
OSError: Timed out trying to connect to tcp://3.87.54.191:8786 after 10 s
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/tornado/ioloop.py", line 741, in _run_callback
    ret = callback()
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/tornado/ioloop.py", line 765, in _discard_future_result
    future.result()
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/deploy/adaptive_core.py", line 204, in adapt
    if status != "down":
UnboundLocalError: local variable 'status' referenced before assignment
tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <zmq.eventloop.ioloop.ZMQIOLoop object at 0x7fbf8cc7f640>>, <Task finished name='Task-16550' coro=<AdaptiveCore.adapt() done, defined at /home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/deploy/adaptive_core.py:178> exception=UnboundLocalError("local variable 'status' referenced before assignment")>)
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/site-packages/distributed/comm/core.py", line 285, in connect
    comm = await asyncio.wait_for(
  File "/home/ec2-user/anaconda3/envs/features_r/lib/python3.8/asyncio/tasks.py", line 490, in wait_for
    raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError
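The second exception in the traceback looks like the usual Python pattern where a name is only bound inside a `try` block after a call that can raise, and is then read in the handler. The sketch below reproduces that pattern in isolation. It is a simplified illustration, not the actual `distributed` source: `failing_target`, `adapt_unguarded`, and `adapt_guarded` are hypothetical names, and the guarded variant shows one possible way to avoid the error, not necessarily how the library handles it.

```python
import asyncio


async def failing_target():
    # Hypothetical stand-in for the scheduler RPC that times out
    # once the scheduler has already gone away.
    raise OSError("Timed out trying to connect")


async def adapt_unguarded():
    try:
        await failing_target()
        status = "up"  # never reached when the call above raises
    except OSError:
        # 'status' was never bound, so this read raises UnboundLocalError
        if status != "down":
            print("stopping")


async def adapt_guarded():
    status = None  # bound before the try block, so the handler can read it
    try:
        await failing_target()
        status = "up"
    except OSError:
        if status != "down":
            return "stopped"
    return "ok"


def run_unguarded():
    # Run the unguarded variant and report which exception type escapes.
    try:
        asyncio.run(adapt_unguarded())
    except UnboundLocalError as exc:
        return type(exc).__name__
    return None
```

In the unguarded variant the original `OSError` is masked by the `UnboundLocalError`, which matches the "During handling of the above exception, another exception occurred" line in the log above.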
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:7 (2 by maintainers)
Top GitHub Comments
https://github.com/PrefectHQ/prefect/issues/5330 has details of an issue that also leads to the IOLoop error, using Prefect with a Fargate cluster.
Sorry to tag, just wondering whether there are any updates on this potential issue in distributed, or whether it is getting any attention? @jacobtomlinson
I am also having this issue. I have tested with 2021.12.0 and 2022.01.1 on Python 3.7/3.8/3.9.
Worth noting, however: I got the same errors through a wrapper (Prefect) using dask-cloudprovider (FargateCluster).
My code runs and the results complete; then, when the process exits, an "IOLoop is closed" RuntimeError is raised in the scheduler from utils.py.