Transient Failures Causing CancellationToken to be cancelled instead of retry after upgrade to 6.0
Steps to reproduce
We recently upgraded our code base from .NET Core 3.1 to .NET 6.0 and, as part of that change, upgraded EntityFrameworkCore and Pomelo. Additionally, it seems we’ve moved from using MySql.Data.MySqlClient to MySqlConnector.
We have an ETL process that runs multiple threads for a job in parallel and calls ExecuteSqlRawAsync against our database stored procedures. Since moving to 6.0, our CancellationTokens are being cancelled and we can’t figure out why. My current theory is that there was a behavior change in how transient failures are handled when they are not retried. It seems as though any transient error we get results in the parent CancellationToken being cancelled, which kills the rest of the DB calls.
We are calling .EnableRetryOnFailure() explicitly on the .UseMySql() options builder, but it seems that ExecuteSqlRawAsync doesn’t use the current execution strategy. What’s odd is that this worked before in 3.1. We’re also trying Polly retries. I think Polly does retry once because it detects the failure, but the retry can’t succeed because by that point the token is already cancelled.
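For readers hitting the same symptom, here is a minimal sketch of one way to make the retry explicit: run the raw SQL call through the context's configured execution strategy. The EtlContext and EtlDb names are hypothetical (the issue does not show the real context), the server version is taken from the details reported in this issue, and whether ExecuteSqlRawAsync already applies the strategy on its own is not confirmed here. Treat it as an illustration under those assumptions, not as the fix that resolved this issue.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical context type; the real one is not shown in the issue.
public class EtlContext : DbContext
{
    public EtlContext(DbContextOptions<EtlContext> options) : base(options) { }
}

public static class EtlDb
{
    // Builds options with Pomelo's retrying execution strategy enabled.
    public static DbContextOptions<EtlContext> BuildOptions(string connectionString) =>
        new DbContextOptionsBuilder<EtlContext>()
            .UseMySql(
                connectionString,
                new MySqlServerVersion(new Version(5, 6)), // MySQL 5.6, as reported in the issue
                mySql => mySql.EnableRetryOnFailure())
            .Options;

    // Runs one stored-procedure call inside the configured execution strategy,
    // so retrying does not depend on what ExecuteSqlRawAsync does internally.
    public static Task<int> CallProcedureAsync(
        EtlContext context,
        string sql,
        IEnumerable<object> parameters,
        CancellationToken cancellationToken)
    {
        var strategy = context.Database.CreateExecutionStrategy();
        return strategy.ExecuteAsync(
            ct => context.Database.ExecuteSqlRawAsync(sql, parameters, ct),
            cancellationToken);
    }
}
```

Two caveats apply. A retried call runs the stored procedure again, so this is only safe if the procedure is idempotent or wrapped in a transaction. And if the token passed in has already been cancelled, neither the execution strategy nor Polly can succeed on a retry, which matches the behavior described above.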
The issue
Basically, we’re seeing transient errors result in tokens being cancelled instead of the failed call being retried.
Further technical details
We’ve had a handful of our team members digging into this issue as we’re trying to get off of 3.1 before it leaves support.
One thing I’m seeing might have been fixed in MySqlConnector 2.1.13, but it appears our version of Pomelo is still on 2.1.12.
MySQL version: 5.6
Pomelo.EntityFrameworkCore.MySql version: 6.0.2
Microsoft.AspNetCore.App version: 6.0.2
Okay, well all the changes paid off, and it looks like we’re in the clear! Thanks again!
We’ve got all of our updates in as well as a change to update the way our cancellationTokens are passed into our background processes. There’s a non-zero chance that that last change was the real kicker. We’re deploying our change to production tonight, as test/stage tested out with flying colors.
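The thread does not show what the token-passing change actually looked like. As a rough sketch of one common pattern, each background job can get its own CancellationTokenSource linked to the host's shutdown token, so that a failure or timeout in one job never cancels its siblings. JobRunner, RunAllAsync, and the per-job timeout are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class JobRunner
{
    // Runs every job with its own token, linked to the shared shutdown token.
    // Returns one flag per job (true = completed, false = failed or cancelled).
    // A failing job is recorded and does not cancel the others; cancelling
    // shutdownToken still stops all of them.
    public static Task<bool[]> RunAllAsync(
        IEnumerable<Func<CancellationToken, Task>> jobs,
        TimeSpan perJobTimeout,
        CancellationToken shutdownToken)
    {
        var tasks = jobs.Select(async job =>
        {
            using var jobCts = CancellationTokenSource.CreateLinkedTokenSource(shutdownToken);
            jobCts.CancelAfter(perJobTimeout); // affects this job only

            try
            {
                await job(jobCts.Token);
                return true;
            }
            catch (Exception ex)
            {
                // Includes OperationCanceledException from a timeout or shutdown.
                Console.Error.WriteLine($"Job failed: {ex.GetType().Name}: {ex.Message}");
                return false;
            }
        });

        return Task.WhenAll(tasks);
    }
}
```

The design choice here is that cancellation flows downward only: the shutdown token can cancel every job, but nothing a job does can cancel the shared token.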