User-defined retries


This came out of the briefest of discussions with Jose. NOTE: All naming is up in the air.

Add a runtime attribute, such as retryOnStderrPattern, that populates a counter value (retryAttempt, retry, retryCount, retry_count, or similar) that other runtime attributes can reference. This would enable tasks such as:

task mytask {
  command {
    mycommand.sh
  }
  runtime {
    retryOnStderrPattern = "(OutOfMemoryError|disk quota exceeded)"
    memory = (6 * retryAttempt) + "GB"
    disk = "local-disk " + (100 * retryAttempt) + " SSD"
    docker = "myrepo/myimage"
  }
}

When the job's stderr matches the specified regular expression, the job should be retried with the counter incremented.
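To make the proposed behaviour concrete, here is a minimal Python sketch of the retry loop. It is not Cromwell code. The names run_with_retries, runtime_for, and max_attempts are hypothetical, and it assumes the counter starts at 1 (otherwise the first attempt in the task above would request 0 GB). The run_job callable stands in for whatever actually executes the job and returns its stderr.

```python
import re


def runtime_for(retry_attempt):
    # Mirrors the runtime block in the task above: resources scale with the counter.
    return {
        "memory": f"{6 * retry_attempt}GB",
        "disk": f"local-disk {100 * retry_attempt} SSD",
    }


def run_with_retries(run_job, stderr_pattern, max_attempts=3):
    """Call run_job until its stderr no longer matches stderr_pattern.

    run_job(runtime) is a hypothetical callable that runs the job with the
    given runtime attributes and returns the job's stderr as a string.
    Returns (attempt_number, stderr) for the last attempt made.
    max_attempts is an assumption; the issue leaves the limit undecided.
    """
    pattern = re.compile(stderr_pattern)
    stderr = ""
    for retry_attempt in range(1, max_attempts + 1):
        stderr = run_job(runtime_for(retry_attempt))
        if not pattern.search(stderr):
            # No retryable error found in stderr: stop here.
            return retry_attempt, stderr
    # Attempts exhausted; the last stderr still matched the pattern.
    return max_attempts, stderr
```

A job that runs out of memory at 6GB would be retried once with 12GB and 200 of local disk, and the loop stops as soon as stderr no longer matches.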

Not yet discussed, as far as I know: how to limit the number of attempts. Options include another runtime attribute, a backend config value, both, or something else.
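As one illustration of the "both" option, the Python sketch below lets a task request a limit while a backend setting caps it. Nothing here is decided in the issue: the key names maxRetryAttempts and max-retry-attempts are hypothetical, and so is the choice of taking the smaller of the two values.

```python
def effective_max_attempts(runtime_attrs, backend_config, default=1):
    """Resolve the attempt limit from a task request and a backend cap.

    Both key names are hypothetical placeholders, not existing options.
    """
    requested = runtime_attrs.get("maxRetryAttempts", default)
    cap = backend_config.get("max-retry-attempts")
    if cap is not None:
        requested = min(requested, cap)
    # Always allow at least the initial attempt.
    return max(1, requested)
```

With this rule a task asking for 5 attempts on a backend capped at 3 gets 3, and a task that asks for nothing gets a single attempt, which matches having no retries at all.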

Issue Analytics

  • State: open
  • Created 7 years ago
  • Reactions: 2
  • Comments: 35 (10 by maintainers)

Top GitHub Comments

geoffjentry commented, Jul 20, 2017 (2 reactions)

FWIW this has been discussed as a key feature for a WDL push next quarter

droazen commented, May 24, 2017 (1 reaction)

Could I propose that this be re-prioritized? It would help us deal with transient GCS hiccups in production (e.g., connections suddenly getting closed). Individual tools in GATK and Picard can’t possibly catch every exception across every library involved, so an execution-framework-level retry at the job level would help enormously.
