Fatal error in gRPC coroutines stack caused by "An error occurred while pushing a context"
See original GitHub issueToday we had a very strange fatal crash in one of our server instances (hosted in ECS/Fargate). We had to restart the instance, otherwise it kept crashing…
Our stack is Armeria 1.13.1 / kotlin 1.5.31; kotlin coroutines 1.5.2; gRPC kotlin 1.2.0; gRPC java 1.41.0;. The gRPC service is based on kotlin coroutines with armeria blocking task executor.
Seems that the initial exception that was recorded in logs, was “An error occurred while pushing a context”:
java.lang.IllegalStateException: Trying to call object wrapped with context [sreqId=e4462b39, chanId=72864b63, raddr=10.204.22.170:49513, laddr=10.204.21.148:8081][h2c://ip-10-204-21-148.ap-southeast-2.compute.internal/<service and operation name that failed>#POST], but context is currently set to [sreqId=5f01fe01, chanId=966cfccd, raddr=10.204.22.170:54644, laddr=10.204.21.148:8081][h1c://ip-10-204-21-148.ap-southeast-2.compute.internal/<full service and operation name>t#POST]. This means the callback was called from unexpected thread or forgetting to close previous context.\
at com.linecorp.armeria.internal.common.RequestContextUtil.newIllegalContextPushingException(RequestContextUtil.java:139)\
at com.linecorp.armeria.server.ServiceRequestContext.push(ServiceRequestContext.java:234)\
at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:546)\
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)\
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)\
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\
at java.base/java.lang.Thread.run(Unknown Source)\
After that, we had hundreds of coroutine-related exceptions - all caused by the same wrong context (sreqId=5f01fe01):
CoroutinesInternalError
Fatal exception in coroutines machinery for DispatchedContinuation[com.linecorp.armeria.common.util.DefaultBlockingTaskExecutor@34e9f91f, Continuation at io.grpc.kotlin.ServerCalls$serverCallListener$1.invokeSuspend(ServerCalls.kt)@61c5d0c6]. Please read KDoc to 'handleFatalException' method and report this incident to maintainers
....
IllegalStateException
Trying to call object wrapped with context [sreqId=067cd880, chanId=217d479a, raddr=10.204.21.9:22674, laddr=10.204.21.148:8081][h2c://ip-10-204-21-148.ap-southeast-2.compute.internal/<service and operation name>#POST], but context is currently set to [sreqId=5f01fe01, chanId=966cfccd, raddr=10.204.22.170:54644, laddr=10.204.21.148:8081][h1c://ip-10-204-21-148.ap-southeast-2.compute.internal/<the same original service and operation name>#POST]. This means the callback was called from unexpected thread or forgetting to close previous context.
com.linecorp.armeria.internal.common.RequestContextUtil in newIllegalContextPushingException at line 139
com.linecorp.armeria.server.ServiceRequestContext in push at line 234
com.linecorp.armeria.server.grpc.ArmeriaRequestCoroutineContext in updateThreadContext at line 46
com.linecorp.armeria.server.grpc.ArmeriaRequestCoroutineContext in updateThreadContext at line 30
kotlinx.coroutines.internal.ThreadContextKt$updateState$1 in invoke at line 54
kotlinx.coroutines.internal.ThreadContextKt$updateState$1 in invoke at line 52
kotlin.coroutines.CoroutineContext$Element$DefaultImpls in fold at line 70
kotlin.coroutines.AbstractCoroutineContextElement in fold at line 15
kotlin.coroutines.CombinedContext in fold at line 131
2
kotlinx.coroutines.internal.ThreadContextKt in updateThreadContext at line 72
kotlinx.coroutines.DispatchedTask in run at line 224
java.util.concurrent.Executors$RunnableAdapter in call
java.util.concurrent.FutureTask in run
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask in run
java.util.concurrent.ThreadPoolExecutor in runWorker
java.util.concurrent.ThreadPoolExecutor$Worker in run
io.netty.util.concurrent.FastThreadLocalRunnable in run at line 30
java.lang.Thread in run
Issue Analytics
- State:
- Created 2 years ago
- Reactions:2
- Comments:16 (8 by maintainers)
Top Results From Across the Web
Add contextual information to last-ditch exception handler #3153
Currently, the very last exception handler in coroutines is Thread.currentThread.defaultUncaughtExceptionHandler().uncaughtException. While ...
Read more >Coroutine exceptions handling | Kotlin
Normally, the handler is used to log the exception, show some kind of error message, terminate, and/or restart the application.
Read more >Segmentation fault when use lua coroutine api and lua_close
You are overflowing the Lua stack by continuously generating new Lua threads and leaving its Lua objects on stack.
Read more >Management & Monitoring - Micronaut Documentation
1 responds with an error when an authenticated user visits a sensitive endpoint. This forces the developer to define how they want their...
Read more >grpc - Go Packages
Call Close on the ClientConn. Cancel the context provided. Call RecvMsg until a non-nil error is returned. A protobuf-generated client-streaming RPC, for ...
Read more >
Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free
Top Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found

Thanks, @kwazii1231 for the thorough analysis. I agree that https://github.com/Kotlin/kotlinx.coroutines/issues/2930 caused the problem and it is fixed by https://github.com/Kotlin/kotlinx.coroutines/releases/tag/1.6.2.
after trying to reproduce this issue with @ikhoon i found very mandatory condition of this. if project use coroutine version below, this issue can be reproduce implementation(“org.jetbrains.kotlinx:kotlinx-coroutines-core-jvm:1.5.x”)
in my case, my project use spring version 2.6.3 and gradle plugin for managing dependency plugin automatically downloaded “org.jetbrains.kotlinx:kotlinx-coroutines-core-jvm:1.5.2” after forced change version to 1.6.4 or upgrade spring boot version to 2.7.x it didn’t happen anymore.
kotlin 1.6.1 release note contain fix of this issue
Rollback of time-related functions dispatching on Dispatchers.Main. This behavior was introduced in 1.6.0 and then found inconvenient and erroneous (https://github.com/Kotlin/kotlinx.coroutines/issues/3106, https://github.com/Kotlin/kotlinx.coroutines/issues/3113). Reworked the newly-introduced CopyableThreadContextElement to solve issues uncovered after the initial release (https://github.com/Kotlin/kotlinx.coroutines/pull/3227). Fixed a bug with ThreadLocalElement not being properly updated in racy scenarios (https://github.com/Kotlin/kotlinx.coroutines/issues/2930). Reverted eager loading of default CoroutineExceptionHandler that triggered ANR on some devices (https://github.com/Kotlin/kotlinx.coroutines/issues/3180). New API to convert a CoroutineDispatcher to a Rx scheduler (https://github.com/Kotlin/kotlinx.coroutines/issues/968, https://github.com/Kotlin/kotlinx.coroutines/issues/548). Thanks @recheej! Fixed a memory leak with the very last element emitted from flow builder being retained in memory (https://github.com/Kotlin/kotlinx.coroutines/issues/3197). Fixed a bug with limitedParallelism on K/N with new memory model throwing ClassCastException (https://github.com/Kotlin/kotlinx.coroutines/issues/3223). CoroutineContext is added to the exception printed to the default CoroutineExceptionHandler to improve debuggability (https://github.com/Kotlin/kotlinx.coroutines/issues/3153). Static memory consumption of Dispatchers.Default was significantly reduced (https://github.com/Kotlin/kotlinx.coroutines/pull/3137). Updated slf4j version in kotlinx-coroutines-slf4j from 1.7.25 to 1.7.32.
maybe leak is related with one of above issues i think this one is related https://github.com/Kotlin/kotlinx.coroutines/issues/2930