Ktor occasionally uses 100% CPU without any load after a failed request.
Ktor Version and Engine Used (client or server and name)
Our application uses both the Ktor server and the Ktor client. Versions:
ktor: 1.2.3
kotlin: 1.3.50
server engine: Netty
client engine: CIO
// http server
implementation "io.ktor:ktor-server-core:$ktor_version"
implementation "io.ktor:ktor-server-netty:$ktor_version"
implementation "io.ktor:ktor-server-host-common:$ktor_version"
implementation "io.ktor:ktor-server-sessions:$ktor_version"
implementation "io.ktor:ktor-auth:$ktor_version"
implementation "io.ktor:ktor-jackson:$ktor_version"
implementation "io.ktor:ktor-auth-jwt:$ktor_version"
implementation "io.ktor:ktor-locations:$ktor_version"
// http client
implementation "io.ktor:ktor-client-core:$ktor_version"
implementation "io.ktor:ktor-client-core-jvm:$ktor_version"
implementation "io.ktor:ktor-client-apache:$ktor_version"
implementation "io.ktor:ktor-client-cio:$ktor_version"
implementation "io.ktor:ktor-client-json:$ktor_version"
implementation "io.ktor:ktor-client-json-jvm:$ktor_version"
implementation "io.ktor:ktor-client-jackson:$ktor_version"
implementation "io.ktor:ktor-client-logging:$ktor_version"
implementation "io.ktor:ktor-client-logging-jvm:$ktor_version"
implementation "io.ktor:ktor-client-auth:$ktor_version"
implementation "io.ktor:ktor-client-auth-jvm:$ktor_version"
Runs on a VPC: Debian Linux 9.11
Describe the bug
Monitoring reported that our server was consuming 100% CPU. The server was idle (it is a test environment and the issue occurred late in the evening) but showed high CPU usage. In the jstack output, I found only one suspicious thread:
"Thread-8@16328" daemon prio=10 tid=0x6f nid=NA runnable
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPoll.wait(EPoll.java:-1)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
- locked <0x3ff5> (a sun.nio.ch.EPollSelectorImpl)
- locked <0x403d> (a sun.nio.ch.Util$2)
at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:146)
at io.ktor.network.selector.ActorSelectorManager.process(ActorSelectorManager.kt:81)
at io.ktor.network.selector.ActorSelectorManager$process$1.invokeSuspend(ActorSelectorManager.kt:-1)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:241)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.lang.Thread.run(Thread.java:834)
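To cross-check which thread is actually burning CPU from inside the JVM, the standard ThreadMXBean can be sampled twice and the per-thread CPU time compared. The following is only a sketch, not part of the original report: the function name busiestThreads, the sampling window, and the top-N cut-off are assumptions made for illustration.

```kotlin
import java.lang.management.ManagementFactory

// Hypothetical helper: returns (thread name, CPU nanoseconds used during the
// sampling window), busiest thread first.
fun busiestThreads(windowMillis: Long = 1000, top: Int = 5): List<Pair<String, Long>> {
    val mx = ManagementFactory.getThreadMXBean()
    // Per-thread CPU time is optional on some JVMs; bail out if unavailable.
    if (!mx.isThreadCpuTimeSupported || !mx.isThreadCpuTimeEnabled) return emptyList()

    // First sample: CPU time per thread id.
    val before = mx.allThreadIds.toList().associateWith { mx.getThreadCpuTime(it) }
    Thread.sleep(windowMillis)

    // Second sample: keep only threads that were alive for both samples.
    return before.mapNotNull { (id, start) ->
        val end = mx.getThreadCpuTime(id)
        val info = mx.getThreadInfo(id)
        if (start < 0 || end < 0 || info == null) null
        else info.threadName to (end - start)
    }.sortedByDescending { it.second }.take(top)
}

fun main() {
    busiestThreads().forEach { (name, nanos) ->
        println("$name: ${nanos / 1_000_000} ms CPU")
    }
}
```

A thread that shows close to the full window as CPU time while the application is idle is a likely candidate for the busy loop; its name can then be matched against the jstack output.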
strace shows that one thread issues a stream of epoll_wait calls, all on the same fd and all with a timeout of 0, and each call returns immediately with no events:
[pid 5938] 23:07:27.637413 epoll_wait(63, [], 1024, 0) = 0 <0.000014>
[pid 5938] 23:07:27.637473 epoll_wait(63, [], 1024, 0) = 0 <0.000016>
[pid 5938] 23:07:27.637538 epoll_wait(63, [], 1024, 0) = 0 <0.000015>
[pid 5938] 23:07:27.637598 epoll_wait(63, [], 1024, 0) = 0 <0.000043>
[pid 5938] 23:07:27.637687 epoll_wait(63, [], 1024, 0) = 0 <0.000016>
[pid 5938] 23:07:27.637739 epoll_wait(63, [], 1024, 0) = 0 <0.000015>
[pid 5938] 23:07:27.637799 epoll_wait(63, [], 1024, 0) = 0 <0.000015>
[pid 5938] 23:07:27.637860 epoll_wait(63, [], 1024, 0) = 0 <0.000016>
lsof shows:
java 3405 root 63u a_inode 0,13 0 17139 [eventpoll]
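The pattern above (epoll_wait with a timeout of 0 returning no events, over and over) is what a non-blocking Selector.selectNow() call looks like when it runs in a tight loop. The sketch below is not Ktor code and does not claim to reproduce the actual defect; it only contrasts selectNow() with a blocking select(timeout) on an empty selector. The helper name countPolls and the window sizes are assumptions chosen for illustration.

```kotlin
import java.nio.channels.Selector

// Hypothetical helper: counts how many times `poll` returns within the window.
// A high count means the calling thread is spinning rather than blocking.
fun countPolls(windowMillis: Long, poll: (Selector) -> Int): Long {
    Selector.open().use { selector ->
        val deadline = System.nanoTime() + windowMillis * 1_000_000
        var calls = 0L
        while (System.nanoTime() < deadline) {
            poll(selector)
            calls++
        }
        return calls
    }
}

fun main() {
    // selectNow() never blocks, so with nothing registered it returns 0 at once.
    val busy = countPolls(200) { it.selectNow() }
    // select(50) parks the thread for up to 50 ms when no channel is ready.
    val blocking = countPolls(200) { it.select(50) }
    println("selectNow() calls in 200 ms: $busy")
    println("select(50) calls in 200 ms: $blocking")
}
```

The selectNow() loop completes far more iterations than the blocking variant, and the thread running it keeps one core fully busy even though no I/O is pending, which matches the idle-but-100%-CPU symptom.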
To Reproduce
According to the monitoring system, the usage spike happened at the same time as a user tried to log out using an expired session. In our system this throws an exception, which is handled by a status page:
install(StatusPages) {
    ...
    Exceptions.apply {
        httpExceptions(testing)
    }
}
@KtorExperimentalAPI
fun StatusPages.Configuration.httpExceptions(testing: Boolean) {
    exception<HttpError> {
        logger.warn("${this.context.request.uri} - failed due to ${it.message}${it.cause?.let {"(caused by $it)"}}", it)
        if (it.code == HttpStatusCode.Unauthorized) {
            call.unauthorized(it.body)
        } else {
            it.body?.let { body ->
                call.respond(it.code, body)
            } ?: call.respond(it.code)
        }
    }
    ...
@KtorExperimentalAPI
suspend fun ApplicationCall.unauthorized(maybeError: HttpErrorBody? = null): Unit {
    // set WWW-Authenticate (as per RFC required for 401 status)
    val realm = application.environment.config.property("authentication.realm").getString()
    val header = HttpAuthHeader.Parameterized(AuthenticationScheme, mapOf(HttpAuthHeader.Parameters.Realm to realm))
    response.headers.append(HttpHeaders.WWWAuthenticate, header.toString())
    // clear session
    sessions.clear<MySessionCookie>()
    // return status code and body
    maybeError?.let { body ->
        respond(HttpStatusCode.Unauthorized, body)
    } ?: respond(HttpStatusCode.Unauthorized)
}
In the log, we have:
2019-11-08 19:31:58,277 DEBUG [nioEventLoopGroup-4-1][172.26.0.8][REQ-496] auth - session 8670142f-97a2-4035-9f17-e62774e0a7c5 has expired
2019-11-08 19:31:58,277 INFO [nioEventLoopGroup-4-1][172.26.0.8][REQ-496] application - finished POST /web/v1/logout with null
2019-11-08 19:31:58,278 WARN [nioEventLoopGroup-4-1][172.26.0.8][REQ-496] application - /web/v1/logout - failed due to SessionExpired com.example.http.HttpError$UnauthorizedAccess: SessionExpired
at com.example.ApplicationKt$module$12$$special$$inlined$session$lambda$1.invokeSuspend(Application.kt:297)
at com.example.ApplicationKt$module$12$$special$$inlined$session$lambda$1.invoke(Application.kt)
at com.example.ApplicationKt$module$12$$special$$inlined$session$1.invokeSuspend(SessionAuth.kt:156)
at com.example.ApplicationKt$module$12$$special$$inlined$session$1.invoke(SessionAuth.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
at io.ktor.auth.Authentication.processAuthentication(Authentication.kt:228)
at io.ktor.auth.Authentication$interceptPipeline$2.invokeSuspend(Authentication.kt:123)
at io.ktor.auth.Authentication$interceptPipeline$2.invoke(Authentication.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
at io.ktor.routing.Routing.executeResult(Routing.kt:147)
at io.ktor.routing.Routing.interceptor(Routing.kt:34)
at io.ktor.routing.Routing$Feature$install$1.invokeSuspend(Routing.kt:99)
at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
at io.ktor.features.ContentNegotiation$Feature$install$1.invokeSuspend(ContentNegotiation.kt:106)
at io.ktor.features.ContentNegotiation$Feature$install$1.invoke(ContentNegotiation.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
at io.ktor.features.StatusPages$interceptCall$2.invokeSuspend(StatusPages.kt:98)
at io.ktor.features.StatusPages$interceptCall$2.invoke(StatusPages.kt)
at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:91)
at kotlinx.coroutines.CoroutineScopeKt.coroutineScope(CoroutineScope.kt:180)
at io.ktor.features.StatusPages.interceptCall(StatusPages.kt:97)
at io.ktor.features.StatusPages$Feature$install$2.invokeSuspend(StatusPages.kt:137)
at io.ktor.features.StatusPages$Feature$install$2.invoke(StatusPages.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
at io.ktor.features.CallLogging$Feature$install$1$invokeSuspend$$inlined$withMDC$1.invokeSuspend(CallLogging.kt:226)
at io.ktor.features.CallLogging$Feature$install$1$invokeSuspend$$inlined$withMDC$1.invoke(CallLogging.kt)
at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:91)
at kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:156)
at kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
at io.ktor.features.CallLogging$Feature$install$1.invokeSuspend(CallLogging.kt:230)
at io.ktor.features.CallLogging$Feature$install$1.invoke(CallLogging.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
at io.ktor.server.engine.DefaultEnginePipelineKt$defaultEnginePipeline$2.invokeSuspend(DefaultEnginePipeline.kt:118)
at io.ktor.server.engine.DefaultEnginePipelineKt$defaultEnginePipeline$2.invoke(DefaultEnginePipeline.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
at io.ktor.server.netty.NettyApplicationCallHandler$handleRequest$1.invokeSuspend(NettyApplicationCallHandler.kt:36)
at io.ktor.server.netty.NettyApplicationCallHandler$handleRequest$1.invoke(NettyApplicationCallHandler.kt)
at kotlinx.coroutines.intrinsics.UndispatchedKt.startCoroutineUndispatched(Undispatched.kt:55)
at kotlinx.coroutines.CoroutineStart.invoke(CoroutineStart.kt:111)
at kotlinx.coroutines.AbstractCoroutine.start(AbstractCoroutine.kt:154)
at kotlinx.coroutines.BuildersKt__Builders_commonKt.launch(Builders.common.kt:54)
at kotlinx.coroutines.BuildersKt.launch(Unknown Source)
at io.ktor.server.netty.NettyApplicationCallHandler.handleRequest(NettyApplicationCallHandler.kt:26)
at io.ktor.server.netty.NettyApplicationCallHandler.channelRead(NettyApplicationCallHandler.kt:20)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:56)
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:365)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:515)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
To me, it looks very similar to https://github.com/ktorio/ktor/issues/1041, but that issue was about the client, while we have the issue with a server.
Expected behavior
The server doesn't consume CPU without load.
Top GitHub Comments
The bug was in ktor-client-apache and it should be fixed in Ktor 1.5.3
Any news about this issue? We are having the same problem in my company with a service that we have in production.
Could this be related to #1018?