Implement task timeouts
Timeouts are an important feature for task runs, but they are difficult to implement correctly. Here is a quick and easy implementation inside of a task runner:
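    import signal
    from contextlib import contextmanager

    def timeout_handler(signum, frame):
        # Runs in the main thread when SIGALRM is delivered.
        raise TimeoutError("Execution timed out.")

    @contextmanager
    def timeout(seconds=None):
        if seconds:
            try:
                signal.signal(signal.SIGALRM, timeout_handler)
                # alarm() takes whole seconds; values that round to 0 schedule no alarm.
                signal.alarm(round(seconds))
                yield
            finally:
                # Always cancel any pending alarm on exit.
                signal.alarm(0)
        else:
            yield

    ...

    with timeout(self.timeout):
        state = self.get_run_state(state=state, inputs=task_inputs)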
This works great for the LocalExecutor and SynchronousExecutor, but fails for DaskExecutor because signal.signal can only be called from the main thread.
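The main-thread restriction can be reproduced without any executor at all. The following is a minimal sketch, assuming a Unix-like platform where SIGALRM exists; the helper names `try_install_handler` and `install_from_worker_thread` are made up for illustration and are not part of the task runner.

```python
import signal
import threading


def try_install_handler(results):
    """Attempt to register a SIGALRM handler from whatever thread calls this."""
    try:
        signal.signal(signal.SIGALRM, lambda signum, frame: None)
        results.append("installed")
    except ValueError as exc:
        # CPython refuses to install signal handlers outside the main thread.
        results.append(f"ValueError: {exc}")


def install_from_worker_thread():
    """Run the attempt in a non-main thread and return what happened."""
    results = []
    worker = threading.Thread(target=try_install_handler, args=(results,))
    worker.start()
    worker.join()
    return results[0]


outcome = install_from_worker_thread()
print(outcome)
```

Any executor that runs tasks in worker threads hits the same ValueError as soon as the timeout context manager tries to register its handler.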
Other possibilities:
- using async to submit task.run with a timeout, but this would likely break our Python 3.4 compatibility
- sending task.run to a new process using multiprocessing.Process; this is probably the route we will eventually go, but note: prints to standard out (which we rely upon for lots of tutorials) will break. Moreover, retrieving the return value from the task.run method is non-trivial and requires additional communication overhead.
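The multiprocessing route might look roughly like the sketch below. It is not the implementation that was eventually adopted. It assumes a Unix-like platform with the "fork" start method, and the names `run_with_timeout`, `_call_and_report`, `add`, and `sleep_for_a_minute` are hypothetical. The queue is the "additional communication overhead" mentioned above: the return value has to be pickled and sent back to the parent.

```python
import multiprocessing
import queue
import time

# Assumption: "fork" is available (Unix-like platforms only).
ctx = multiprocessing.get_context("fork")


def _call_and_report(result_queue, fn, args):
    """Run fn in the child and send its outcome back to the parent."""
    try:
        result_queue.put(("ok", fn(*args)))
    except Exception as exc:
        result_queue.put(("error", repr(exc)))


def run_with_timeout(fn, args=(), seconds=1.0):
    """Run fn(*args) in a separate process; raise TimeoutError if it is too slow."""
    result_queue = ctx.Queue()
    proc = ctx.Process(target=_call_and_report, args=(result_queue, fn, args))
    proc.start()
    try:
        # Read the result before joining so a large payload cannot block the child.
        status, payload = result_queue.get(timeout=seconds)
    except queue.Empty:
        proc.terminate()  # the child really stops, unlike a thread
        proc.join()
        raise TimeoutError("Execution timed out.")
    proc.join()
    if status == "error":
        raise RuntimeError(payload)
    return payload  # must be picklable to cross the process boundary


def add(x, y):
    return x + y


def sleep_for_a_minute():
    time.sleep(60)
```

A call such as `run_with_timeout(add, (1, 2), seconds=5)` returns the child's result, while `run_with_timeout(sleep_for_a_minute, seconds=0.2)` terminates the child and raises TimeoutError.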
Issue Analytics
- Created 5 years ago
- Comments: 7 (7 by maintainers)
It’s very nice to be able to use (for example) LocalExecutor for debugging, which also implies working in the same process. That’s an argument for two different implementations (or possibly adding the multiprocessing solution on BaseExecutor and then overriding just for LocalExecutor).

Sorry, somewhere in the mix I got ahead of myself; I don’t think there’s actually a solution in the local case when DaskExecutor(processes=True). In that case, signal doesn’t work because it isn’t run in the main thread, and the multiprocessing spawning doesn’t work because you can’t spawn a process from within a daemonic subprocess. If the dask cluster was actually distributed (and not local), and the workers had the no-nanny flag set, then the multiprocessing solution should work.

I might go ahead and submit a PR that works for all executors other than that one and we can refactor as necessary.
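The daemonic-subprocess limitation described above can be seen with plain multiprocessing, independent of dask. This is a small sketch assuming a Unix-like platform with the "fork" start method and a standard CPython run without the -O flag; the function names are hypothetical.

```python
import multiprocessing

# Assumption: "fork" is available (Unix-like platforms only).
ctx = multiprocessing.get_context("fork")


def _noop():
    pass


def _try_to_spawn_child(result_queue):
    """Runs inside a daemonic process and tries to start a child of its own."""
    try:
        child = ctx.Process(target=_noop)
        child.start()
        child.join()
        result_queue.put("child started")
    except AssertionError as exc:
        # multiprocessing refuses to let daemonic processes have children.
        result_queue.put(f"AssertionError: {exc}")


def spawn_from_daemonic_process():
    """Start a daemonic process and report whether it could spawn a child."""
    result_queue = ctx.Queue()
    parent = ctx.Process(
        target=_try_to_spawn_child, args=(result_queue,), daemon=True
    )
    parent.start()
    outcome = result_queue.get(timeout=10)
    parent.join()
    return outcome
```

Calling `spawn_from_daemonic_process()` reports an AssertionError from the inner `start()` call, which is the failure a process-based timeout would hit inside a daemonic worker.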