question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Fatal error in gRPC coroutines stack caused by "An error occurred while pushing a context"

See original GitHub issue

Today we had a very strange fatal crash in one of our server instances (hosted in ECS/Fargate). We had to restart the instance, otherwise it kept crashing…

Our stack is Armeria 1.13.1 / kotlin 1.5.31; kotlin coroutines 1.5.2; gRPC kotlin 1.2.0; gRPC java 1.41.0;. The gRPC service is based on kotlin coroutines with armeria blocking task executor.

Seems that the initial exception that was recorded in logs, was “An error occurred while pushing a context”:

java.lang.IllegalStateException: Trying to call object wrapped with context [sreqId=e4462b39, chanId=72864b63, raddr=10.204.22.170:49513, laddr=10.204.21.148:8081][h2c://ip-10-204-21-148.ap-southeast-2.compute.internal/<service and operation name that failed>#POST], but context is currently set to [sreqId=5f01fe01, chanId=966cfccd, raddr=10.204.22.170:54644, laddr=10.204.21.148:8081][h1c://ip-10-204-21-148.ap-southeast-2.compute.internal/<full service and operation name>t#POST]. This means the callback was called from unexpected thread or forgetting to close previous context.\
at com.linecorp.armeria.internal.common.RequestContextUtil.newIllegalContextPushingException(RequestContextUtil.java:139)\
at com.linecorp.armeria.server.ServiceRequestContext.push(ServiceRequestContext.java:234)\
at com.linecorp.armeria.common.RequestContext.lambda$makeContextAware$3(RequestContext.java:546)\
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)\
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)\
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\
at java.base/java.lang.Thread.run(Unknown Source)\

After that, we had hundreds of coroutine-related exceptions - all caused by the same wrong context (sreqId=5f01fe01):

CoroutinesInternalError
Fatal exception in coroutines machinery for DispatchedContinuation[com.linecorp.armeria.common.util.DefaultBlockingTaskExecutor@34e9f91f, Continuation at io.grpc.kotlin.ServerCalls$serverCallListener$1.invokeSuspend(ServerCalls.kt)@61c5d0c6]. Please read KDoc to 'handleFatalException' method and report this incident to maintainers

....

IllegalStateException
Trying to call object wrapped with context [sreqId=067cd880, chanId=217d479a, raddr=10.204.21.9:22674, laddr=10.204.21.148:8081][h2c://ip-10-204-21-148.ap-southeast-2.compute.internal/<service and operation name>#POST], but context is currently set to [sreqId=5f01fe01, chanId=966cfccd, raddr=10.204.22.170:54644, laddr=10.204.21.148:8081][h1c://ip-10-204-21-148.ap-southeast-2.compute.internal/<the same original service and operation name>#POST]. This means the callback was called from unexpected thread or forgetting to close previous context.

com.linecorp.armeria.internal.common.RequestContextUtil in newIllegalContextPushingException at line 139
com.linecorp.armeria.server.ServiceRequestContext in push at line 234
com.linecorp.armeria.server.grpc.ArmeriaRequestCoroutineContext in updateThreadContext at line 46
com.linecorp.armeria.server.grpc.ArmeriaRequestCoroutineContext in updateThreadContext at line 30
kotlinx.coroutines.internal.ThreadContextKt$updateState$1 in invoke at line 54
kotlinx.coroutines.internal.ThreadContextKt$updateState$1 in invoke at line 52
kotlin.coroutines.CoroutineContext$Element$DefaultImpls in fold at line 70
kotlin.coroutines.AbstractCoroutineContextElement in fold at line 15
kotlin.coroutines.CombinedContext in fold at line 131
2
kotlinx.coroutines.internal.ThreadContextKt in updateThreadContext at line 72
kotlinx.coroutines.DispatchedTask in run at line 224
java.util.concurrent.Executors$RunnableAdapter in call
java.util.concurrent.FutureTask in run
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask in run
java.util.concurrent.ThreadPoolExecutor in runWorker
java.util.concurrent.ThreadPoolExecutor$Worker in run
io.netty.util.concurrent.FastThreadLocalRunnable in run at line 30
java.lang.Thread in run

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Reactions:2
  • Comments:16 (8 by maintainers)

github_iconTop GitHub Comments

1reaction
ikhooncommented, Oct 12, 2022

Thanks, @kwazii1231 for the thorough analysis. I agree that https://github.com/Kotlin/kotlinx.coroutines/issues/2930 caused the problem and it is fixed by https://github.com/Kotlin/kotlinx.coroutines/releases/tag/1.6.2.

1reaction
kwazii1231commented, Oct 10, 2022

after trying to reproduce this issue with @ikhoon i found very mandatory condition of this. if project use coroutine version below, this issue can be reproduce implementation(“org.jetbrains.kotlinx:kotlinx-coroutines-core-jvm:1.5.x”)

in my case, my project use spring version 2.6.3 and gradle plugin for managing dependency plugin automatically downloaded “org.jetbrains.kotlinx:kotlinx-coroutines-core-jvm:1.5.2” after forced change version to 1.6.4 or upgrade spring boot version to 2.7.x it didn’t happen anymore.

kotlin 1.6.1 release note contain fix of this issue

Rollback of time-related functions dispatching on Dispatchers.Main. This behavior was introduced in 1.6.0 and then found inconvenient and erroneous (https://github.com/Kotlin/kotlinx.coroutines/issues/3106, https://github.com/Kotlin/kotlinx.coroutines/issues/3113). Reworked the newly-introduced CopyableThreadContextElement to solve issues uncovered after the initial release (https://github.com/Kotlin/kotlinx.coroutines/pull/3227). Fixed a bug with ThreadLocalElement not being properly updated in racy scenarios (https://github.com/Kotlin/kotlinx.coroutines/issues/2930). Reverted eager loading of default CoroutineExceptionHandler that triggered ANR on some devices (https://github.com/Kotlin/kotlinx.coroutines/issues/3180). New API to convert a CoroutineDispatcher to a Rx scheduler (https://github.com/Kotlin/kotlinx.coroutines/issues/968, https://github.com/Kotlin/kotlinx.coroutines/issues/548). Thanks @recheej! Fixed a memory leak with the very last element emitted from flow builder being retained in memory (https://github.com/Kotlin/kotlinx.coroutines/issues/3197). Fixed a bug with limitedParallelism on K/N with new memory model throwing ClassCastException (https://github.com/Kotlin/kotlinx.coroutines/issues/3223). CoroutineContext is added to the exception printed to the default CoroutineExceptionHandler to improve debuggability (https://github.com/Kotlin/kotlinx.coroutines/issues/3153). Static memory consumption of Dispatchers.Default was significantly reduced (https://github.com/Kotlin/kotlinx.coroutines/pull/3137). Updated slf4j version in kotlinx-coroutines-slf4j from 1.7.25 to 1.7.32.

maybe leak is related with one of above issues i think this one is related https://github.com/Kotlin/kotlinx.coroutines/issues/2930

Read more comments on GitHub >

github_iconTop Results From Across the Web

Add contextual information to last-ditch exception handler #3153
Currently, the very last exception handler in coroutines is Thread.currentThread.defaultUncaughtExceptionHandler().uncaughtException. While ...
Read more >
Coroutine exceptions handling | Kotlin
Normally, the handler is used to log the exception, show some kind of error message, terminate, and/or restart the application.
Read more >
Segmentation fault when use lua coroutine api and lua_close
You are overflowing the Lua stack by continuously generating new Lua threads and leaving its Lua objects on stack.
Read more >
Management & Monitoring - Micronaut Documentation
1 responds with an error when an authenticated user visits a sensitive endpoint. This forces the developer to define how they want their...
Read more >
grpc - Go Packages
Call Close on the ClientConn. Cancel the context provided. Call RecvMsg until a non-nil error is returned. A protobuf-generated client-streaming RPC, for ...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found