Enhance split query buffering
File a bug
When enabled, split querying is supposed to buffer all result sets except the last. This does not seem to happen when a query produces only one result set: in that case, the entire result set is still buffered.
Include your code
Here is sample code that demonstrates the issue.
using System;
using System.Text;
using Microsoft.EntityFrameworkCore;
namespace SplitQueryBug {
    public static class Program {
        public static void Main() {
            var CS = "Data Source=(local);Initial Catalog=__SPLIT_QUERY_BUG;Integrated Security=SSPI;";
            //Ensure our DB exists
            {
                using var Context = MyContext.FromSqlServerConnectionString(CS);
                Context.Database.EnsureCreated();
            }
            //Create our seed text which is 1MB in size.
            var SeedText = "";
            {
                var SB = new StringBuilder();
                for (int i = 0; i < 1024 * 1024 * 1; i++) {
                    SB.Append("X");
                }
                SeedText = SB.ToString();
            }
            //Store 32*32 = 1024 records (this would be 1GB of data)
            for (int i = 0; i < 32; i++) {
                Console.WriteLine($@"Seeding batch {i}...");
                using var Context = MyContext.FromSqlServerConnectionString(CS);
                for (int j = 0; j < 32; j++) {
                    var Data = new ContextData() {
                        Text = SeedText,
                    };
                    Context.Add(Data);
                }
                Context.SaveChanges();
            }
            //Run our query.
            //You'll notice that memory usage spikes when this happens and there is a giant
            //delay while all rows are fetched
            {
                using var Context = MyContext.FromSqlServerConnectionString(CS);
                var Query = Context.Data
                    .AsNoTracking()
                    .AsEnumerable()
                    ;
                foreach (var item in Query) {
                    //Intentionally empty: enumerating the results is enough to show the issue
                }
            }
        }
    }
    public class MyContext : DbContext {
        public DbSet<ContextData> Data => Set<ContextData>();
        public MyContext(DbContextOptions<MyContext> Options) : base(Options) {
        }
        public static MyContext FromSqlServerConnectionString(string ConnectionString) {
            //Split querying is enabled globally here, which is what triggers the buffering
            var Options = new DbContextOptionsBuilder<MyContext>()
                .UseSqlServer(ConnectionString, x => x.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery))
                .Options
                ;
            var ret = new MyContext(Options);
            return ret;
        }
    }
    public class ContextData {
        public long Id { get; set; }
        public string Text { get; set; } = string.Empty;
    }
}
I have observed that if query splitting is not active, whether it is disabled globally or only on the single query, the results are correctly returned in an unbuffered manner. However, if it is enabled at all, the entire query seems to be buffered.
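As a stopgap, the observation above suggests opting the affected query out of split-query mode. The following is a minimal sketch, assuming the MyContext and ContextData types from the sample code; AsSingleQuery() is the per-query EF Core operator (5.0 and later) that overrides the globally configured splitting behavior, while the StreamingWorkaround class and CountStreamed method are hypothetical names used only for illustration.

```csharp
using Microsoft.EntityFrameworkCore;

namespace SplitQueryBug {
    // Hypothetical helper, not part of the original repro.
    public static class StreamingWorkaround {
        // Enumerates the table with split querying disabled for this one query,
        // so rows are expected to stream instead of being buffered up front.
        public static long CountStreamed(MyContext context) {
            long count = 0;
            var query = context.Data
                .AsNoTracking()
                .AsSingleQuery(); // per-query override of the global SplitQuery setting
            foreach (var item in query) {
                count++;
            }
            return count;
        }
    }
}
```

Since this query has no collection includes, running it as a single query should not change the SQL that is generated; only the buffering behavior is expected to differ.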
Include stack traces
This is confirmed through two methods:
- Memory usage explodes until the table is fully loaded.
- If I pause the app when trying to get the first result from the enumerable, I get a stack trace which seems to indicate buffering.
Include provider and version information
EF Core version: 6.x
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET 6.0 RC1
Operating system: Win10
IDE: VS 2022
Aside from that, https://github.com/dotnet/efcore/issues/26129#issuecomment-924867825 seems to repeat the information already given.
Our IsBuffering flag is not sufficient. For cases like a retrying execution strategy, we need to buffer results no matter what. But in the case of split query, we can avoid buffering the last reader. So two possible solutions here are adding another flag on QCC (QueryCompilationContext) to differentiate the reason for buffering, or adding a true buffered data reader.
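The first option can be sketched roughly as follows. This is only an illustration of the idea of recording why buffering was requested, not how EF Core implements or would implement it: BufferingReason, BufferingPolicy, and ShouldBuffer are all hypothetical names that do not exist in EF Core.

```csharp
using System;

// Hypothetical: records why buffering was requested instead of a single bool.
[Flags]
public enum BufferingReason {
    None = 0,
    RetryingExecutionStrategy = 1, // results must be buffered no matter what
    SplitQuery = 2,                // only readers before the last need buffering
}

// Hypothetical policy helper, for illustration only.
public static class BufferingPolicy {
    // Decides whether the reader at readerIndex (0-based) out of readerCount
    // result sets should be buffered.
    public static bool ShouldBuffer(BufferingReason reason, int readerIndex, int readerCount) {
        if (readerIndex < 0 || readerIndex >= readerCount) {
            throw new ArgumentOutOfRangeException(nameof(readerIndex));
        }
        // A retrying execution strategy forces buffering of every reader.
        if (reason.HasFlag(BufferingReason.RetryingExecutionStrategy)) {
            return true;
        }
        // With split query alone, the last reader can be streamed.
        if (reason.HasFlag(BufferingReason.SplitQuery)) {
            return readerIndex < readerCount - 1;
        }
        return false;
    }
}
```

Under this sketch, the case from the report (split query enabled, a single result set) would not be buffered: BufferingPolicy.ShouldBuffer(BufferingReason.SplitQuery, 0, 1) returns false, because the only reader is also the last one.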