Scheduler cannot recover from database connection error

Version: 3.0.7

Expected behavior

If a job is scheduled using a simple repeating trigger then a temporary loss of a database connection should be recoverable.

Actual behavior

Jobs scheduled on that scheduler never recover and never start running normally again.

Steps to reproduce

  1. Set up a database and configure the Quartz standard scheduler factory (StdSchedulerFactory) to use it. In my example I’m using PostgreSQL.
  2. Create a scheduler and schedule a job with a simple trigger that runs every 10 seconds.
  3. While the job is not executing, restart the PostgreSQL database service.
  4. Observe that the job, which should run every 10 seconds, no longer runs, even if you wait for 5 minutes.

The example uses .NET Core 2.2, Npgsql 4.0.7, and Quartz 3.0.7.

using Quartz;
using Quartz.Impl;
using System;
using System.IO;
using System.Threading.Tasks;

namespace QuartzConnectionLossExample
{
	class Program
	{
		// Please don't target this against a database that you are using for other stuff.
		private static readonly string _connectionString = "Host=localhost;Port=5432;Database=quartztest;UserName=postgres;Password=postgres";

		/// <summary>
		/// Path to the script that sets up quartz database tables. Set to null if you don't want to clear the public schema and create quartz database objects.
		/// </summary>
		private static readonly string _quartzSQLScriptPath = "Sql/quartzSetupScript.sql";
		private static readonly bool dontSetup = false;

		private static readonly System.Collections.Specialized.NameValueCollection quartzConfig =
			new System.Collections.Specialized.NameValueCollection()
			{
				{ "quartz.serializer.type", "json" },
				{ "quartz.jobStore.type", "Quartz.Impl.AdoJobStore.JobStoreTX, Quartz" },
				{ "quartz.jobStore.driverDelegateType", "Quartz.Impl.AdoJobStore.PostgreSQLDelegate, Quartz" },
				{ "quartz.jobStore.tablePrefix", "QRTZ_" },
				{ "quartz.jobStore.dataSource", "PostgreSql" },
				{ "quartz.dataSource.PostgreSql.connectionString", _connectionString },
				{ "quartz.dataSource.PostgreSql.provider", "Npgsql" },
				{ "quartz.plugin.triggHistory.type", "Quartz.Plugin.History.LoggingJobHistoryPlugin, Quartz.Plugins" },
				{ "quartz.threadPool.threadCount", "2" }
			};

		private static readonly StdSchedulerFactory _schedulerFactory = new StdSchedulerFactory(quartzConfig);

		static void Main(string[] args)
		{
			Console.WriteLine("Example start.");

			if (!dontSetup)
			{
				SetupDatabase();
			}

			var scheduler = _schedulerFactory.GetScheduler().GetAwaiter().GetResult();

			SetupJobAndSimpleTrigger(scheduler);

			Console.WriteLine("Scheduler started, feel free to restart your database service and see if it can recover from that.");
			Console.ReadLine();
		}

		private static void SetupJobAndSimpleTrigger(IScheduler scheduler)
		{
			IJobDetail job = null;

			if (!scheduler.CheckExists(new JobKey("TestJob", "TestGroup")).GetAwaiter().GetResult())
			{
				job = JobBuilder.Create<TestJob>()
					.WithIdentity(new JobKey("TestJob", "TestGroup"))
					.WithDescription("Test.")
					.RequestRecovery(true)
					.StoreDurably(true)
					.Build();
			}
			else
			{
				job = scheduler.GetJobDetail(new JobKey("TestJob", "TestGroup")).GetAwaiter().GetResult();
			}

			if (!scheduler.CheckExists(new TriggerKey("TestTrigger", "TestGroup")).GetAwaiter().GetResult() && job != null)
			{
				var trigger = TriggerBuilder.Create()
					.WithIdentity(new TriggerKey("TestTrigger", "TestGroup"))
					.WithDescription("Runs every 10 seconds")
					.WithSimpleSchedule(x => x
						.WithIntervalInSeconds(10)
						.RepeatForever()
						.WithMisfireHandlingInstructionIgnoreMisfires()
					)
					.StartNow()
					.EndAt(null)
					.Build();

				scheduler.ScheduleJob(job, trigger).GetAwaiter().GetResult();
			}

			scheduler.Start().GetAwaiter().GetResult();
		}

		static void SetupDatabase()
		{
			#region Clean public schema and setup quartz database objects.
			if (!string.IsNullOrWhiteSpace(_quartzSQLScriptPath))
			{
				Console.WriteLine("Setting up database");

				using (var connection = new Npgsql.NpgsqlConnection(_connectionString))
				{
					connection.Open();

					using (var command = connection.CreateCommand())
					{
						command.CommandText =
$@"
DROP SCHEMA ""public"" CASCADE;
CREATE SCHEMA ""public"";
" + File.ReadAllText(_quartzSQLScriptPath);

						using (var transaction = connection.BeginTransaction())
						{
							command.Transaction = transaction;
							command.ExecuteNonQuery();
							transaction.Commit();
						}
					}
				}

				Console.WriteLine("Database setup completed");
			}
			else
			{
				Console.WriteLine("No setup script specified, setup skipped.");
			}
			#endregion
		}
	}

	// Once you hit the breakpoint below you should hit continue immediately, and then immediately try to restart the database.
	// Basically it will only fail if the database connection error occurs before the Execute method is called.
	[DisallowConcurrentExecution]
	public class TestJob : IJob
	{
		public Task Execute(IJobExecutionContext context)
		{
			System.Diagnostics.Debugger.Break();
			return Task.Delay(1000);
		}
	}
}

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 23 (12 by maintainers)

Top GitHub Comments

1 reaction
Squall90 commented, Dec 10, 2019

I am having the exact same issue with MS SQL.

ERROR Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder (null) Staging - Couldn't rollback ADO.NET connection. Transaction not connected, or was disconnected
System.InvalidOperationException: Transaction not connected, or was disconnected
   at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.CheckNotZombied()
   at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.Rollback(Boolean transientError)

Is there a workaround? The issue is that ALL of the jobs are halted until the service is restarted.

1 reaction
ghost commented, Dec 10, 2019

I’m having this issue also, using MS SQL.

2019-12-06 13:18:50,276 [91] ERROR Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder (null) Staging - Couldn't rollback ADO.NET connection. Transaction not connected, or was disconnected
System.InvalidOperationException: Transaction not connected, or was disconnected
   at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.CheckNotZombied()
   at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.Rollback(Boolean transientError)
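
Neither comment above includes a workaround. Since the jobs reportedly stay halted until the service is restarted, one option that might be worth trying is to recycle the scheduler in-process when its trigger falls behind. The following is only a minimal sketch under stated assumptions, not a fix confirmed in the issue thread and not tested against this bug. SchedulerWatchdog, IsStalled, checkInterval, and maxLag are hypothetical names introduced here. The sketch also assumes that StdSchedulerFactory.GetScheduler() hands out a fresh scheduler once the previous one has been shut down.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Quartz;
using Quartz.Impl;

// Hypothetical helper: SchedulerWatchdog is not part of Quartz.NET.
public static class SchedulerWatchdog
{
    // Periodically checks one trigger and recycles the scheduler if the
    // trigger's next fire time has fallen further behind than maxLag.
    public static async Task RunAsync(
        StdSchedulerFactory factory,
        TriggerKey triggerKey,
        TimeSpan checkInterval,
        TimeSpan maxLag,
        CancellationToken cancellationToken)
    {
        IScheduler scheduler = null;

        while (!cancellationToken.IsCancellationRequested)
        {
            try
            {
                if (scheduler == null || scheduler.IsShutdown)
                {
                    // Assumption: after Shutdown, the factory returns a new instance.
                    scheduler = await factory.GetScheduler(cancellationToken);
                    await scheduler.Start(cancellationToken);
                }
                else if (await IsStalled(scheduler, triggerKey, maxLag, cancellationToken))
                {
                    Console.WriteLine("Trigger looks stalled, recycling the scheduler.");
                    // Do not wait for running jobs; a new scheduler is created on the next pass.
                    await scheduler.Shutdown(false, cancellationToken);
                }
            }
            catch (Exception ex) when (!(ex is OperationCanceledException))
            {
                // The database may still be down; log and try again on the next pass.
                Console.WriteLine($"Watchdog check failed, will retry: {ex.Message}");
            }

            await Task.Delay(checkInterval, cancellationToken);
        }
    }

    // A trigger counts as stalled when its stored next fire time is more than maxLag in the past.
    private static async Task<bool> IsStalled(
        IScheduler scheduler, TriggerKey triggerKey, TimeSpan maxLag, CancellationToken cancellationToken)
    {
        ITrigger trigger = await scheduler.GetTrigger(triggerKey, cancellationToken);
        DateTimeOffset? next = trigger?.GetNextFireTimeUtc();
        return next.HasValue && DateTimeOffset.UtcNow - next.Value > maxLag;
    }
}
```

With the reproduction program above, the watchdog could be started next to the scheduler by passing _schedulerFactory and new TriggerKey("TestTrigger", "TestGroup"), with maxLag set comfortably above the 10-second trigger interval so that normal scheduling jitter does not cause a restart. Recycling the scheduler is a blunt instrument: it interrupts bookkeeping for any job that is currently running, so it is a stopgap rather than a substitute for a fix in the job store.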

Top Results From Across the Web

Retry Quartz Scheduler start up on database connection ...
The quartz configuration uses JDBC job store. If the database connection is down (due to intermittent network failure) at the time of calling ......

1095016 – Quartz stops working if database connection is ...
Description of problem: If connection to the database is temporarily interrupted, Quartz stops working and no future jobs are scheduled.

Quartz Scheduler unable to recover from a database outage
After a database outage the scheduler is not able to revive and throws jdbc connection exception. I have provided a stack trace below....

JIRA Services stop working due to a database network failure
Cause. JIRA services related database processes are unable to recover from the connection loss due to a network issue that occurred. Workaround.

Resolving the 'DB Connection Invalidated' Scheduler Error
The “DB Connection Invalidated” error typically occurs when Apache Airflow loses its connection to the database. This can happen due to a ...