[BUG] Unhandled exception in ProcessManager when checking process.HasExited
Library name and version
Microsoft.Azure.Services.AppAuthentication 1.0.3
Describe the bug
We use AzureServiceTokenProvider for managed identity in an Azure Kubernetes environment and observed the exception call stack below. It seems that ProcessManager does not catch the exception thrown by the if (!process.HasExited) check, which can eventually crash the host process.
This happens infrequently. Per the Process.HasExited documentation, the property can throw InvalidOperationException, so I would expect the caller to catch it.
Unhandled exception. System.AggregateException: One or more errors occurred. (No process is associated with this object.)
---> System.InvalidOperationException: No process is associated with this object.
at System.Diagnostics.Process.EnsureState(State state)
at System.Diagnostics.Process.get_HasExited()
at Microsoft.Azure.Services.AppAuthentication.ProcessManager.<>c__DisplayClass2_0.<ExecuteAsync>b__3()
at System.Threading.CancellationToken.<>c.<.cctor>b__26_0(Object obj)
at System.Threading.CancellationTokenSource.CallbackNode.<>c.<ExecuteCallback>b__9_0(Object s)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.CancellationTokenSource.CallbackNode.ExecuteCallback()
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
--- End of inner exception stack trace ---
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
at System.Threading.CancellationTokenSource.NotifyCancellation(Boolean throwOnFirstException)
at System.Threading.CancellationTokenSource.<>c.<.cctor>b__56_0(Object obj)
at System.Threading.TimerQueueTimer.CallCallback(Boolean isThreadPool)
at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool)
at System.Threading.TimerQueue.FireNextTimers()
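The InvalidOperationException at the top of this trace can be seen in isolation. The minimal sketch below reads HasExited on a Process object that was never started. HasExitedDemo and Probe are hypothetical names used only for illustration, and this is not the production failure path, just the same exception surfacing from the same property.

```csharp
using System;
using System.Diagnostics;

// Hypothetical demo class, not part of Microsoft.Azure.Services.AppAuthentication.
public static class HasExitedDemo
{
    // Returns the name of the exception type thrown by HasExited,
    // or "no exception" if the property could be read.
    public static string Probe()
    {
        // The Process object exists, but Start() was never called,
        // so no OS process is associated with it.
        using var process = new Process();
        try
        {
            _ = process.HasExited;
            return "no exception";
        }
        catch (InvalidOperationException ex)
        {
            return ex.GetType().Name;
        }
    }
}
```

In ProcessManager the check runs inside a cancellation callback, so an exception thrown there has no caller in the library that could catch it.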
Expected behavior
No unhandled exception is thrown.
Actual behavior
An unhandled exception is thrown, which causes the process to crash.
Reproduction Steps
We don’t have reproduction steps. This was caught in production logs while our code, which consumes this package, was running in an Azure Kubernetes environment.
Our pod uses the package to get a managed identity token through the pod identity framework in an Azure Kubernetes cluster. We then use the token to access our backend Azure SQL server.
Environment
Azure Kubernetes environment based on Ubuntu 18.04, .NET Core 3.1
Issue Analytics
- Created 2 years ago
- Comments: 8 (5 by maintainers)
Top GitHub Comments
@shuoshen2014 We’re currently working on a design spec for managed identity token caching in the Azure Identity client libraries. Development on this feature is expected to kick off in the next semester (tentatively, the April/May timeframe).
Yes @shuoshen2014, this is a valid scenario for a bug fix: add try/catch logic around the process.HasExited call if it still reproduces in 1.6.2. Feel free to create a PR against the repo! Tagging @schaabs and @scottaddie to help/take note of the Azure.Identity ask for token caching.
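A minimal sketch of the kind of guard suggested above, assuming the cancellation callback only needs to kill the child process if it is still running. SafeProcessKill, TryKill, and RegisterKillOnCancel are hypothetical names, not part of Microsoft.Azure.Services.AppAuthentication, and the actual ProcessManager code may be structured differently.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, shown only to illustrate the try/catch guard.
public static class SafeProcessKill
{
    // Kills the process if it is still running.
    // Returns true only if Kill() was called without throwing.
    public static bool TryKill(Process process)
    {
        try
        {
            if (!process.HasExited)
            {
                process.Kill();
                return true;
            }
        }
        catch (InvalidOperationException)
        {
            // No process is associated with the object (never started or
            // already disposed), or it exited between the check and Kill().
        }
        catch (Win32Exception)
        {
            // The exit status could not be read or the process could not be terminated.
        }
        return false;
    }

    // Registers a cancellation callback that cannot throw the exceptions above.
    public static CancellationTokenRegistration RegisterKillOnCancel(
        Process process, CancellationToken token)
    {
        return token.Register(() => TryKill(process));
    }
}
```

With a guard like this, an exception thrown by HasExited or Kill stays inside the callback instead of escaping from the timer thread that fires the cancellation.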