Executor submit and wait interfaces are asymmetrical

See original GitHub issue

Current behavior

The API docs and the comments in the code (src/prefect/engine/executors/base.py) suggest (to me) that submit() returns a Future and that the arguments subsequently provided to wait() will also be Future objects. This makes perfect sense; however, the default FlowRunner class treats Executor.submit() and Executor.wait() asymmetrically, specifically:

  • submit() is called with a function to execute and is expected to return “a future-like object”
  • wait() is called with the futures arg set to a Dict[Task, Future].

This is a bit unexpected and not intuitive.
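
To make the asymmetry concrete, here is a rough sketch of the calling pattern described above (the names tasks, run_fn, and executor are illustrative placeholders, not the actual FlowRunner internals):

futures = {}
for task in tasks:
    # submit() receives a plain callable and returns a future-like object
    futures[task] = executor.submit(run_fn, task)

# ...but wait() receives the whole Task -> Future mapping rather than Future objects
final_states = executor.wait(futures)  # futures is a Dict[Task, Future]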

Proposed behavior

Since the FlowRunner class is the one calling submit(), it already holds the returned Future in full context and should perform the mapping from each Task to its Future object. For each final_task it should call executor.wait() with the Task's Future object returned from submit(), not the Dict[Task, Future].

The current implementation seems to complicate the Executor interface since the executor doesn’t really need to know about tasks or states, just how to manage execution of a function.
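
A rough sketch of the proposed symmetric usage, again with illustrative placeholder names:

futures = {task: executor.submit(run_fn, task) for task in tasks}

# The FlowRunner keeps the Task -> Future mapping to itself and hands
# only Future objects to the executor.
final_futures = [futures[task] for task in final_tasks]
results = executor.wait(final_futures)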

Example

Consider the following example, which uses concurrent.futures.ThreadPoolExecutor from the Python standard library to illustrate the behavior I would expect (supporting code removed for brevity):

import concurrent.futures
import contextlib

import prefect.engine.executors.base


class MyExecutor(prefect.engine.executors.base.Executor):
    def __init__(self):
        super().__init__()

    @contextlib.contextmanager
    def start(self):
        # Create the thread pool that will run submitted functions.
        self.exec = concurrent.futures.ThreadPoolExecutor()
        try:
            yield self.exec
        finally:
            self.exec.shutdown(wait=True)

    def submit(self, fn, *args, **kwargs):
        # Hand the callable to the pool and return its Future.
        return self.exec.submit(fn, *args, **kwargs)

    def wait(self, futures):
        # Collect results from each future in turn.
        results = []
        for f in futures:
            results.append(f.result())  # block until the future returns
        return results
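
For completeness, here is a small usage sketch of the class above, outside of any FlowRunner (the add function and its arguments are illustrative only):

def add(x, y):
    return x + y

executor = MyExecutor()
with executor.start():
    futs = [executor.submit(add, i, i) for i in range(3)]
    print(executor.wait(futs))  # prints [0, 2, 4]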

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 8

Top GitHub Comments

1 reaction
ghost commented, Jun 25, 2020

We’re running Prefect locally.

It’s a bit early in our journey to say whether the on_start and on_exit callbacks you suggest above would work for everything we need to do, but they would be sufficient for what I’m doing now, which is basically wrapping the DaskExecutor’s submit (i.e., on_start) and wait (i.e., on_exit).
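
For context, a minimal sketch of that kind of wrapping, assuming the Prefect 0.x DaskExecutor import path; the setup/teardown placeholders stand in for the suggested on_start and on_exit callbacks:

from prefect.engine.executors import DaskExecutor

class WrappedDaskExecutor(DaskExecutor):
    def submit(self, fn, *args, **kwargs):
        # ...setup that an on_start callback would otherwise perform...
        return super().submit(fn, *args, **kwargs)

    def wait(self, futures):
        results = super().wait(futures)
        # ...teardown that an on_exit callback would otherwise perform...
        return results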

0 reactions
jcrist commented, Mar 5, 2021

Closing in favor of #4213.
