Queries with MultipleActiveResultSets=True (MARS) are very slow / time out on Linux
Describe the bug
TL;DR:
Queries using connections with MARS enabled, even when they don’t use MARS, are much slower or even time out on Linux. The same queries are fast and reliable on Windows whether MARS is enabled or disabled, and on Linux when MARS is disabled.
Context
Octopus Cloud hosts Octopus Deploy instances in Linux containers on Azure AKS, with data stored in Azure Files and Azure SQL. A couple of months ago we noticed that some of the SQL queries were much slower or even started timing out, which is not something we had experienced before on Windows using the full .NET Framework. Some of the slowdown might be caused by AKS (K8s), but we think that SqlClient might also be playing a role here. 119112824000676 is our Azure Support Request, if that helps in any way.
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): Unknown error 258
at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction(TransactionRequest transactionRequest, String name, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
at Microsoft.Data.SqlClient.SqlInternalConnection.BeginSqlTransaction(IsolationLevel iso, String transactionName, Boolean shouldReconnect)
at Microsoft.Data.SqlClient.SqlConnection.BeginTransaction(IsolationLevel iso, String transactionName)
at Microsoft.Data.SqlClient.SqlConnection.BeginTransaction(IsolationLevel iso)
at reprocli.Program.Scenario4(String connString, Int32 number)
at reprocli.Program.<>c__DisplayClass0_0.<Main>b__0(Int32 n)
at System.Linq.Parallel.ForAllOperator`1.ForAllEnumerator`1.MoveNext(TInput& currentElement, Int32& currentKey)
at System.Linq.Parallel.ForAllSpoolingTask`2.SpoolingWork()
at System.Linq.Parallel.SpoolingTaskBase.Work()
at System.Linq.Parallel.QueryTask.BaseWork(Object unused)
at System.Linq.Parallel.QueryTask.<>c.<.cctor>b__10_0(Object o)
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.<>c.<.cctor>b__274_0(Object obj)
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
ClientConnectionId:005d2aae-9409-4711-aaa0-b03b70f2832e
Error Number:-2,State:0,Class:11
ClientConnectionId before routing:e3300799-fdd0-40a4-84ea-b9f383596b12
Routing Destination:fed2c41af7dc.tr5.westus2-a.worker.database.windows.net,11063<---
We also captured TCP dumps while running the tests on Linux and it looks like enabling MARS causes TCP RST.


Full TCP Dumps: https://github.com/benPearce1/k8s-sql-timeout-repro/tree/tiny/source/reprocli/tcpdumps
To reproduce
Code
Repo with the sample app: https://github.com/benPearce1/k8s-sql-timeout-repro/blob/tiny/source/reprocli/Program.cs. The Compiled folder contains pre-compiled versions of the app, so the .NET Core SDK doesn’t have to be present on the target VMs.
The first parameter is the level of parallelism. The second parameter is the connection string.
using System;
using System.Data;
using System.Diagnostics;
using System.Linq;
using Microsoft.Data.SqlClient;

namespace reprocli
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                var count = int.Parse(args[0]);
                var connectionString = args[1];
                var total = Stopwatch.StartNew();
                PrepareData(connectionString);
                total.Restart();
                Enumerable.Range(0, count)
                    .AsParallel()
                    .WithDegreeOfParallelism(count)
                    .ForAll(n => Scenario4(connectionString, n));
                Console.WriteLine($"Total: {total.Elapsed}");
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                throw;
            }
        }

        private static void Scenario4(string connString, int number)
        {
            var userStopWatch = Stopwatch.StartNew();
            var buffer = new object[100];
            for (var i = 0; i < 210; i++)
            {
                var queryStopWatch = Stopwatch.StartNew();
                using (var connection = new SqlConnection(connString))
                {
                    connection.Open();
                    using (var transaction = connection.BeginTransaction(IsolationLevel.ReadCommitted))
                    {
                        using (var command = new SqlCommand("SELECT * From TestTable", connection, transaction))
                        {
                            using (var reader = command.ExecuteReader())
                            {
                                while (reader.Read())
                                {
                                    reader.GetValues(buffer);
                                }
                            }
                        }
                        transaction.Commit();
                    }
                }
                queryStopWatch.Stop();
                Console.WriteLine($"Number: {number}. Query: {i} Time: {queryStopWatch.Elapsed}");
            }
            userStopWatch.Stop();
            Console.WriteLine($"Number: {number}. All Queries. Time: {userStopWatch.Elapsed}");
        }

        static void PrepareData(string connectionString)
        {
            var createTable = @"
                DROP TABLE IF EXISTS TestTable;
                CREATE TABLE TestTable
                (
                    [Id] [nvarchar](50) NOT NULL PRIMARY KEY,
                    [Name] [nvarchar](20) NOT NULL
                );";
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();
                using (var transaction = connection.BeginTransaction(IsolationLevel.ReadCommitted))
                {
                    using (var command = new SqlCommand(createTable, connection, transaction))
                    {
                        command.ExecuteNonQuery();
                    }
                    transaction.Commit();
                }
            }
        }
    }
}
This is how we reproduced the problem; you don’t necessarily need this exact configuration.
The database was hosted in an Azure SQL Elastic Pool (Standard: 300 eDTUs) on a SQL Server in West US 2 region.
LINUX
Run the sample app with the following arguments on a Linux (Ubuntu 18.04) VM (Standard D8s v3: 8 vCPUs, 32 GiB memory) in the Azure West US 2 region.
MARS ON
dotnet reprocli.dll 200 'Server=tcp:YOURSERVER.database.windows.net,1433;Initial Catalog=TestDatabase;Persist Security Info=False;User ID=YOURUSER;Password=YOURPASSWORD;MultipleActiveResultSets=True;'
The expected result is that the app finishes without throwing any errors. That is not what happens: the app throws Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Reducing the level of parallelism to 20 stops the app from crashing.
Also, when MARS is ON the console doesn’t show any progress for 10+ seconds. This is not the case when MARS is OFF.
MARS OFF
dotnet reprocli.dll 200 'Server=tcp:YOURSERVER.database.windows.net,1433;Initial Catalog=TestDatabase;Persist Security Info=False;User ID=YOURUSER;Password=YOURPASSWORD;MultipleActiveResultSets=False;'
The expected result is that the app finishes without throwing any errors, which is the case. The app finished in just under 25 seconds (Total: 00:00:24.9737616). The app also worked with much higher levels of parallelism (e.g. 500).
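The two runs above differ only in the MultipleActiveResultSets keyword. As a minimal sketch (not part of the original repro), both variants can be derived from one base connection string with SqlConnectionStringBuilder from Microsoft.Data.SqlClient. The class name MarsToggle and the helper WithMars are hypothetical names chosen for illustration, and the server, user and password are the same placeholders used above. Note that the builder may normalize keyword spelling in the string it emits.

```csharp
using System;
using Microsoft.Data.SqlClient;

public static class MarsToggle
{
    // Hypothetical helper: returns a copy of the connection string
    // with MultipleActiveResultSets explicitly set to the given value.
    public static string WithMars(string connectionString, bool enabled)
    {
        var builder = new SqlConnectionStringBuilder(connectionString)
        {
            MultipleActiveResultSets = enabled
        };
        return builder.ConnectionString;
    }

    public static void Main()
    {
        // Placeholder values, same shape as the connection strings above.
        const string baseConnectionString =
            "Server=tcp:YOURSERVER.database.windows.net,1433;" +
            "Initial Catalog=TestDatabase;Persist Security Info=False;" +
            "User ID=YOURUSER;Password=YOURPASSWORD;";

        // Building the strings does not open a connection.
        Console.WriteLine(WithMars(baseConnectionString, enabled: true));
        Console.WriteLine(WithMars(baseConnectionString, enabled: false));
    }
}
```

Each resulting string can then be passed as the second argument to reprocli, so the MARS ON and MARS OFF runs are guaranteed to differ in nothing else.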
AKS
Same spec as above: Linux (Ubuntu 18.04) VM (Standard D8s v3: 8 vCPUs, 32 GiB memory) in Azure West US 2. We also ran this test in a container in AKS and the results were pretty much the same. The only difference was that we had to lower the level of parallelism even more. K8s networking adds a bit of overhead, which might make the problem more pronounced.
WINDOWS
Run the sample app with the following arguments on a Windows (Windows Server 2016 Datacenter) VM (Standard D8s v3: 8 vCPUs, 32 GiB memory) in the Azure West US 2 region.
dotnet reprocli.dll 200 'Server=tcp:YOURSERVER.database.windows.net,1433;Initial Catalog=TestDatabase;Persist Security Info=False;User ID=YOURUSER;Password=YOURPASSWORD;MultipleActiveResultSets=True;'
The expected result is that the app finishes without throwing an exception, which is the case. The app finished in just under 24 seconds (Total: 00:00:23.4068641). It also worked with the level of parallelism set to 500. We achieved similar results with MARS disabled.
Note: We used .NET Core to run the tests on Windows.
Expected behavior
The sample app should not crash, and connections with the MARS feature enabled should behave the same way on both Linux and Windows.
Further technical details
Microsoft.Data.SqlClient version: 1.1.0 and 2.0.0-preview1.20021.1
.NET target: Core 2.2 and Core 3.1
SQL Server version: Azure SQL
Operating system: Ubuntu 18.04 and AKS with Ubuntu 18.04
Additional context
We’ve been battling this issue for a long time now, so we are happy to help in any way we can to get it resolved.
Good that this GitHub issue exists (thanks!); we seem to have run into the same issue. The problem only appears when running the app (ASP.NET Core + EF Core 3.1.2) on Docker with Kubernetes with MARS on. Our background service, which handles lots of data, would simply “die”, sometimes with and sometimes without an exception being thrown. As it is a BackgroundService/IHostedService, the web app continues to run; only the BackgroundService is gone.
I turned MARS off and now it works.
I got two kinds of exceptions, this one with default settings of DbContext.
When setting the command timeout to five minutes, I got this exception - same as the opener of this issue.
This issue cost us many working days of diagnosis, as there is no clear indication of what is wrong, which hinders troubleshooting.
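Since nothing in the failure itself points at MARS, one option is to make the configuration visible at startup. The following is only a sketch under assumptions: MarsStartupCheck, IsMarsEnabled and ShouldWarn are hypothetical names, the connection string is a placeholder, and the warning text is illustrative. It relies on SqlConnectionStringBuilder.MultipleActiveResultSets and RuntimeInformation.IsOSPlatform, and it does not open a connection.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Data.SqlClient;

public static class MarsStartupCheck
{
    // True when the connection string enables MARS.
    // The keyword defaults to false when it is absent.
    public static bool IsMarsEnabled(string connectionString) =>
        new SqlConnectionStringBuilder(connectionString).MultipleActiveResultSets;

    // Hypothetical policy: warn only for the combination reported in this issue.
    public static bool ShouldWarn(string connectionString, bool isLinux) =>
        isLinux && IsMarsEnabled(connectionString);

    public static void Main()
    {
        // Placeholder connection string for illustration.
        const string connectionString =
            "Server=tcp:YOURSERVER.database.windows.net,1433;" +
            "Initial Catalog=TestDatabase;MultipleActiveResultSets=True;";

        bool isLinux = RuntimeInformation.IsOSPlatform(OSPlatform.Linux);
        if (ShouldWarn(connectionString, isLinux))
        {
            Console.Error.WriteLine(
                "Warning: MultipleActiveResultSets=True on Linux; " +
                "timeouts under load have been reported with this combination.");
        }
    }
}
```

A check like this does not fix anything, but it puts the suspect setting into the logs next to any later timeout.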
This bit us big time. Setting MultipleActiveResultSets=true caused lots of timeouts when running a .NET Core app in a Linux pod on K8s. Removing it from the connection string made the app very fast and responsive, and the “Connection Timeout Expired” errors are all gone.