Executor submit and wait interfaces are asymmetrical

See original GitHub issue

Current behavior

The API docs and the comments in the code (src/prefect/engine/executors/base.py) suggest (to me) that submit() returns a Future and that the arguments subsequently provided to wait() will also be Future objects. This makes perfect sense; however, the default FlowRunner class treats Executor.submit() and Executor.wait() asymmetrically, specifically:

  • submit() is called with a function to execute and is expected to return “a future-like object”
  • wait() is called with the futures arg set to a Dict[Task, Future].

This is a bit unexpected and not intuitive.
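
To make the asymmetry concrete, here is a rough sketch of the calling pattern described above (the names tasks, run_fn, and executor are illustrative placeholders, not the actual FlowRunner internals):

futures = {}
for task in tasks:
    # submit() receives a plain callable and returns a future-like object
    futures[task] = executor.submit(run_fn, task)

# ...but wait() receives the whole Task -> Future mapping rather than Future objects
final_states = executor.wait(futures)  # futures is a Dict[Task, Future]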

Proposed behavior

Since the FlowRunner class is the one calling submit(), it already holds the returned Future in full context and should perform the mapping from each Task to its Future object. For each final_task it should call executor.wait() with the Task's Future object returned from submit(), not the Dict[Task, Future].

The current implementation seems to complicate the Executor interface since the executor doesn’t really need to know about tasks or states, just how to manage execution of a function.
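
A rough sketch of the proposed symmetric usage, again with illustrative placeholder names:

futures = {task: executor.submit(run_fn, task) for task in tasks}

# The FlowRunner keeps the Task -> Future mapping to itself and hands
# only Future objects to the executor.
final_futures = [futures[task] for task in final_tasks]
results = executor.wait(final_futures)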

Example

Consider the following example, which uses concurrent.futures.ThreadPoolExecutor from the Python standard library to illustrate the behavior I would expect (supporting code removed for brevity):

import concurrent.futures
import contextlib

import prefect.engine.executors.base


class MyExecutor(prefect.engine.executors.base.Executor):
    def __init__(self):
        super().__init__()

    @contextlib.contextmanager
    def start(self):
        # Create the thread pool that will run submitted functions.
        self.exec = concurrent.futures.ThreadPoolExecutor()
        try:
            yield self.exec
        finally:
            self.exec.shutdown(wait=True)

    def submit(self, fn, *args, **kwargs):
        # Hand the callable to the pool and return its Future.
        return self.exec.submit(fn, *args, **kwargs)

    def wait(self, futures):
        # Collect results from each future in turn.
        results = []
        for f in futures:
            results.append(f.result())  # block until the future returns
        return results
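
For completeness, here is a small usage sketch of the class above, outside of any FlowRunner (the add function and its arguments are illustrative only):

def add(x, y):
    return x + y

executor = MyExecutor()
with executor.start():
    futs = [executor.submit(add, i, i) for i in range(3)]
    print(executor.wait(futs))  # prints [0, 2, 4]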

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 8

Top GitHub Comments

1 reaction
ghost commented, Jun 25, 2020

We’re running Prefect locally.

It’s a bit early in our journey to say whether the on_start and on_exit callbacks you suggest above would work for everything we need to do, but they would be sufficient for what I’m doing now, which is basically wrapping the DaskExecutor’s submit (i.e., on_start) and wait (i.e., on_exit).
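
For context, a minimal sketch of that kind of wrapping, assuming the Prefect 0.x DaskExecutor import path; the setup/teardown placeholders stand in for the suggested on_start and on_exit callbacks:

from prefect.engine.executors import DaskExecutor

class WrappedDaskExecutor(DaskExecutor):
    def submit(self, fn, *args, **kwargs):
        # ...setup that an on_start callback would otherwise perform...
        return super().submit(fn, *args, **kwargs)

    def wait(self, futures):
        results = super().wait(futures)
        # ...teardown that an on_exit callback would otherwise perform...
        return results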

0 reactions
jcrist commented, Mar 5, 2021

Closing in favor of #4213.
