Retrying tasks wait to retry in the TaskRunner, blocking other tasks from running

Current behavior

Let’s say we have a task like this (pseudo code below):

from datetime import timedelta

import requests
from prefect import task, Flow
from prefect.engine.executors import LocalDaskExecutor
from prefect.environments import LocalEnvironment


# Retry up to 3 times, waiting 5 minutes between attempts.
@task(max_retries=3, retry_delay=timedelta(minutes=5))
def get_url_page(url_page: str):
    response = requests.get(url_page)
    response.raise_for_status()  # an HTTP error here triggers a retry
    return response

If we execute this Flow like so:

num_workers = 4  # assumed value; the original snippet leaves num_workers undefined

with Flow('Example') as flow:
    all_url_pages = ['link_to_page1', 'link_to_page2', 'link_to_page3']  # ... further pages elided
    url_page_results = get_url_page.map(all_url_pages)

flow.environment = LocalEnvironment(
    labels=[],
    executor=LocalDaskExecutor(scheduler="threads", num_workers=num_workers),
)

Then, if one of the requests fails and the task waits 5 minutes before being retried, none of the other URLs mapped to this task are executed in the meantime, even though the worker is idle.
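To make the symptom concrete, here is a minimal sketch in plain Python (no Prefect involved) of what the issue title suggests is happening: the retry wait takes place inside the worker, so the worker slot stays occupied. The names `run_with_inline_retry` and `simulate` are hypothetical, the delay is shortened to a fraction of a second, and this is only a simulation of the pattern, not Prefect's actual TaskRunner code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_inline_retry(name, fail_times, retry_delay, attempts, log):
    """Run a fake job; on failure, sleep inside the worker before retrying."""
    for attempt in range(attempts):
        if attempt < fail_times:
            log.append((name, "failed"))
            time.sleep(retry_delay)  # the worker slot stays occupied while waiting
            continue
        log.append((name, "ok"))
        return name
    raise RuntimeError(f"{name} exhausted its retries")


def simulate(num_workers=1, retry_delay=0.05):
    """Submit three fake jobs; only page1 fails once before succeeding."""
    log = []
    jobs = [("page1", 1), ("page2", 0), ("page3", 0)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [
            pool.submit(run_with_inline_retry, name, fails, retry_delay, 3, log)
            for name, fails in jobs
        ]
        results = [future.result() for future in futures]
    return results, log


if __name__ == "__main__":
    results, log = simulate()
    for entry in log:
        print(entry)
```

With a single worker, page2 and page3 only start after page1 has finished waiting and retrying, which mirrors the stall described above.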

Proposed behavior

Ideally, when a single job in the mapped task fails, the worker should move on to the next job while the failed one waits for its retry.
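One way to picture the proposed behavior is a loop that parks a failed job with a "ready at" timestamp and keeps working through the remaining jobs, returning to the parked job only once its delay has elapsed. The sketch below is plain Python under stated assumptions: `run_with_deferred_retries` and `flaky` are hypothetical names, the delay is shortened for illustration, and none of this reflects how Prefect schedules retries internally.

```python
import heapq
import time


def run_with_deferred_retries(jobs, func, max_retries=3, retry_delay=0.05):
    """Run func(job) for each job; failed jobs are parked and retried later."""
    # Each heap entry is (ready_at, original_index, job, attempts_so_far).
    pending = [(0.0, index, job, 0) for index, job in enumerate(jobs)]
    heapq.heapify(pending)
    results, completion_order = {}, []
    while pending:
        ready_at, index, job, attempts = heapq.heappop(pending)
        wait = ready_at - time.monotonic()
        if wait > 0:
            time.sleep(wait)  # only sleeps when nothing else is ready to run
        try:
            results[job] = func(job)
            completion_order.append(job)
        except Exception:
            if attempts >= max_retries:
                raise  # retries exhausted, surface the error
            retry_at = time.monotonic() + retry_delay
            heapq.heappush(pending, (retry_at, index, job, attempts + 1))
    return results, completion_order


if __name__ == "__main__":
    calls = {}

    def flaky(job):
        # Hypothetical stand-in for a request: page1 fails on its first call.
        calls[job] = calls.get(job, 0) + 1
        if job == "page1" and calls[job] == 1:
            raise ValueError("temporary failure")
        return job.upper()

    print(run_with_deferred_retries(["page1", "page2", "page3"], flaky))
```

Here page2 and page3 complete while page1 is waiting, and page1 is retried afterwards. A priority queue keyed on the retry time is just one design choice; an event loop or a scheduler-level retry queue would serve the same purpose.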

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 14 (2 by maintainers)

Top GitHub Comments

2 reactions
madkinsz commented, Oct 16, 2020

This makes a lot of sense! I’m not sure off the top of my head why it’s behaving that way but I’ll try to dig into it soon.

1 reaction
newskooler commented, Oct 19, 2020

Do you still need code to reproduce?

I still think there is a lot of added value here. Sometimes, when requesting many URLs (say hundreds, thousands, or more), one of them fails, e.g. because the endpoint is not yet available and will only be available some minutes or hours later. In that case it makes sense to keep requesting the rest of the URLs in the meantime, or at least to have the option to specify such behaviour (this would actually be ideal) while keeping the current behaviour as the default, for example.
