Is there a way to add retry policy for transient failures in database operations?

We run our clustered job scheduler in Azure. It is common for a database connection to be interrupted from time to time, or for a SQL command to fail because of a transient network failure. Is there a way to configure a retry policy so that, for example, when a trigger state update fails, the retry policy is applied immediately?

I am looking for a way to avoid restarting the scheduler.

Thank you!

Quartz.JobPersistenceException: Couldn't update states of blocked triggers: Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.Data.SqlClient.SqlException: Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception: The wait operation timed out
   --- End of inner exception stack trace ---
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.SqlCommand.InternalEndExecuteNonQuery(IAsyncResult asyncResult, String endMethod, Boolean isInternal)
   at System.Data.SqlClient.SqlCommand.EndExecuteNonQueryInternal(IAsyncResult asyncResult)
   at System.Data.SqlClient.SqlCommand.EndExecuteNonQueryAsync(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Quartz.Impl.AdoJobStore.StdAdoDelegate.<UpdateTriggerStatesForJobFromOtherState>d__70.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.<TriggerFired>d__237.MoveNext()
   --- End of inner exception stack trace ---
   at Quartz.Impl.AdoJobStore.JobStoreSupport.<TriggerFired>d__237.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.<>c__DisplayClass236_0.<<TriggersFired>b__0>d.MoveNext() [See nested exception: System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.SqlCommand.InternalEndExecuteNonQuery(IAsyncResult asyncResult, String endMethod, Boolean isInternal)
   at System.Data.SqlClient.SqlCommand.EndExecuteNonQueryInternal(IAsyncResult asyncResult)
   at System.Data.SqlClient.SqlCommand.EndExecuteNonQueryAsync(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Quartz.Impl.AdoJobStore.StdAdoDelegate.<UpdateTriggerStatesForJobFromOtherState>d__70.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Quartz.Impl.AdoJobStore.JobStoreSupport.<TriggerFired>d__237.MoveNext()

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

2 reactions
lahma commented, Nov 9, 2019

@puddlewitt you are indeed correct that the logic won’t take action here. As far as I understand there is a logic fault, so I’ve opened an issue on the Java side to discuss what the correct action is. The retry logic should come from the JobStoreSupport level, where the transaction is retried on error.

0 reactions
puddlewitt commented, Nov 8, 2019

I don’t think TriggerFired is covered by IsTransient, because it doesn’t call RollbackConnection. The exception also doesn’t appear to be caught by ReleaseAcquiredTrigger, because the exception type doesn’t match.
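
For readers who want to see what "retry on transient error" can look like in general, here is a minimal sketch of the pattern in C#. It is not Quartz.NET code and does not change how JobStoreSupport behaves: `TransientRetry`, `ExecuteAsync`, and all parameter names are hypothetical, and the caller supplies the rule that decides which exceptions count as transient. It only illustrates a bounded retry loop with exponential back-off around an operation your own code controls.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of Quartz.NET.
public static class TransientRetry
{
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 4,
        TimeSpan? initialDelay = null,
        CancellationToken cancellationToken = default)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isTransient == null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        // Assumed default; tune the starting delay for your environment.
        TimeSpan delay = initialDelay ?? TimeSpan.FromMilliseconds(200);

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation(cancellationToken).ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Transient failure with attempts remaining: wait, then retry.
                // Non-transient errors and the final attempt propagate to the caller.
                await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
                delay = TimeSpan.FromTicks(delay.Ticks * 2); // exponential back-off
            }
        }
    }
}
```

With System.Data.SqlClient, the `isTransient` predicate could be something like `ex => ex is SqlException sql && sql.Number == -2`, since -2 is the error number SqlClient reports for a timeout such as the one in the stack trace above. Treating a timeout as retryable is an assumption here: it is only safe when the wrapped operation can be repeated without side effects. Because the exception in this issue is thrown inside Quartz's own job store, a wrapper like this in application code would not catch it; as noted above, that retry has to happen at the JobStoreSupport level.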

Top Results From Across the Web

Azure SQL Database - Working with transient errors
Retry logic for transient errors. Client programs that occasionally encounter a transient error are more robust when they contain retry logic.

Transient fault handling - Best practices for cloud applications
Perform retry operations only when the faults are transient (typically indicated by the nature of the error) and when there's at least some ...

How to Fix Transient Errors to Improve Application Resilience
Determine when a fault is likely to be transient or a terminal one. Retry the operation if it determines that the fault is...

Transient Fault Handling | Serverless360 Blog
This article is about handling transient failures in Azure. ... Create a retry policy that uses a retry strategy from the configuration.

Best practices for retry pattern - harish bhattbhatt - Medium
Understand that operation failed is suitable for retry · Use Exponential back-off for retry · Determine the number of retry attempts and interval ...
