Occasionally Ktor utilizes 100% CPU without any load after a failed request.

Ktor Version and Engine Used (client or server and name)

Our application uses both the Ktor server and the Ktor client. Versions:

  • ktor: 1.2.3
  • kotlin: 1.3.50
  • server’s engine: Netty
  • client’s engine: CIO

        // http server
        implementation "io.ktor:ktor-server-core:$ktor_version"
        implementation "io.ktor:ktor-server-netty:$ktor_version"
        implementation "io.ktor:ktor-server-host-common:$ktor_version"
        implementation "io.ktor:ktor-server-sessions:$ktor_version"
        implementation "io.ktor:ktor-auth:$ktor_version"
        implementation "io.ktor:ktor-jackson:$ktor_version"
        implementation "io.ktor:ktor-auth-jwt:$ktor_version"
        implementation "io.ktor:ktor-locations:$ktor_version"
        // http client
        implementation "io.ktor:ktor-client-core:$ktor_version"
        implementation "io.ktor:ktor-client-core-jvm:$ktor_version"
        implementation "io.ktor:ktor-client-apache:$ktor_version"
        implementation "io.ktor:ktor-client-cio:$ktor_version"
        implementation "io.ktor:ktor-client-json:$ktor_version"
        implementation "io.ktor:ktor-client-json-jvm:$ktor_version"
        implementation "io.ktor:ktor-client-jackson:$ktor_version"
        implementation "io.ktor:ktor-client-logging:$ktor_version"
        implementation "io.ktor:ktor-client-logging-jvm:$ktor_version"
        implementation "io.ktor:ktor-client-auth:$ktor_version"
        implementation "io.ktor:ktor-client-auth-jvm:$ktor_version"

Runs on a VPC: Debian Linux 9.11

Describe the bug

Monitoring reported that our server consumes 100% CPU. The server is idle (it is a test environment, and the issue occurred late in the evening) but still shows high CPU usage. In the jstack output, I found only one suspicious thread:

"Thread-8@16328" daemon prio=10 tid=0x6f nid=NA runnable
  java.lang.Thread.State: RUNNABLE
      at sun.nio.ch.EPoll.wait(EPoll.java:-1)
      at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
      at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
      - locked <0x3ff5> (a sun.nio.ch.EPollSelectorImpl)
      - locked <0x403d> (a sun.nio.ch.Util$2)
      at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:146)
      at io.ktor.network.selector.ActorSelectorManager.process(ActorSelectorManager.kt:81)
      at io.ktor.network.selector.ActorSelectorManager$process$1.invokeSuspend(ActorSelectorManager.kt:-1)
      at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
      at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:241)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      at java.lang.Thread.run(Thread.java:834)
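
When a thread dump alone does not make the hot thread obvious, per-thread CPU time can also be read from inside the JVM. The following is a minimal sketch using the standard `ThreadMXBean`; the names `ThreadCpu` and `busiestThreads` are hypothetical helpers and are not part of Ktor or of the original report.

```kotlin
import java.lang.management.ManagementFactory

// Hypothetical holder for one thread's accumulated CPU time.
data class ThreadCpu(val name: String, val state: Thread.State, val cpuMillis: Long)

// Returns the live threads with the highest accumulated CPU time, highest first.
// Returns an empty list if the JVM does not support or has disabled CPU time measurement.
fun busiestThreads(limit: Int = 5): List<ThreadCpu> {
    val mx = ManagementFactory.getThreadMXBean()
    if (!mx.isThreadCpuTimeSupported || !mx.isThreadCpuTimeEnabled) return emptyList()
    return mx.allThreadIds.toList()
        .mapNotNull { id ->
            // getThreadInfo returns null if the thread has already terminated.
            val info = mx.getThreadInfo(id) ?: return@mapNotNull null
            // getThreadCpuTime returns -1 if the thread is no longer alive.
            val cpuNanos = mx.getThreadCpuTime(id)
            if (cpuNanos < 0) null
            else ThreadCpu(info.threadName, info.threadState, cpuNanos / 1_000_000)
        }
        .sortedByDescending { it.cpuMillis }
        .take(limit)
}

fun main() {
    busiestThreads().forEach { println("${it.cpuMillis} ms  ${it.state}  ${it.name}") }
}
```

Calling this twice a few seconds apart and comparing the values shows which thread is still accumulating CPU time while the server is idle.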

strace shows that one thread issues a continuous stream of epoll_wait calls, all on the same fd and with a timeout of 0, so each call returns immediately:

[pid  5938] 23:07:27.637413 epoll_wait(63, [], 1024, 0) = 0 <0.000014>
[pid  5938] 23:07:27.637473 epoll_wait(63, [], 1024, 0) = 0 <0.000016>
[pid  5938] 23:07:27.637538 epoll_wait(63, [], 1024, 0) = 0 <0.000015>
[pid  5938] 23:07:27.637598 epoll_wait(63, [], 1024, 0) = 0 <0.000043>
[pid  5938] 23:07:27.637687 epoll_wait(63, [], 1024, 0) = 0 <0.000016>
[pid  5938] 23:07:27.637739 epoll_wait(63, [], 1024, 0) = 0 <0.000015>
[pid  5938] 23:07:27.637799 epoll_wait(63, [], 1024, 0) = 0 <0.000015>
[pid  5938] 23:07:27.637860 epoll_wait(63, [], 1024, 0) = 0 <0.000016>
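
The pattern in the strace output above (a timeout of 0 and an empty result every time) is what a loop around `Selector.selectNow()` produces when nothing is ready. The sketch below is not Ktor's `ActorSelectorManager` code. It only contrasts a non-blocking poll loop with a blocking one using the standard `java.nio` API, and the function names are made up for illustration.

```kotlin
import java.nio.channels.Selector

// Counts how many non-blocking selectNow() calls fit into the time window.
// With no ready channels every call returns 0 at once, so the loop spins.
fun countNonBlockingPolls(windowMillis: Long): Long =
    Selector.open().use { selector ->
        val deadline = System.nanoTime() + windowMillis * 1_000_000
        var calls = 0L
        while (System.nanoTime() < deadline) {
            selector.selectNow()
            calls++
        }
        calls
    }

// Same loop, but select(timeout) parks the thread until a channel is ready
// or the timeout expires, so far fewer calls are made.
fun countBlockingPolls(windowMillis: Long, timeoutMillis: Long): Long =
    Selector.open().use { selector ->
        val deadline = System.nanoTime() + windowMillis * 1_000_000
        var calls = 0L
        while (System.nanoTime() < deadline) {
            selector.select(timeoutMillis)
            calls++
        }
        calls
    }

fun main() {
    println("selectNow() calls in 100 ms: ${countNonBlockingPolls(100)}")
    println("select(20) calls in 100 ms:  ${countBlockingPolls(100, 20)}")
}
```

A selector loop that never falls back to the blocking variant keeps one core busy even when no connection is active, which would match the symptom described here.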

lsof shows that fd 63 is an eventpoll instance:

java 3405 root 63u a_inode 0,13 0 17139 [eventpoll]
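
strace reports the spinning thread as pid 5938, which is presumably a thread id inside the java process 3405 listed by lsof. A jstack dump on JDK 11 typically prints each thread's native id as a hexadecimal `nid`, so converting the decimal id lets the two outputs be matched. The dump above shows `nid=NA`, so this is only a sketch of how the correlation would work with a dump that includes nid values. `tidToNid` is a hypothetical helper.

```kotlin
// Converts a decimal OS thread id (as printed by strace or top -H)
// into the hexadecimal form used for the nid field in jstack output.
fun tidToNid(tid: Int): String = "0x" + tid.toString(16)

fun main() {
    println(tidToNid(5938)) // prints 0x1732
}
```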

To Reproduce

According to the monitoring system, the usage spike happened at the same time that a user tried to log out using an expired session. In our system, this leads to an exception being thrown and handled by a status page:

    install(StatusPages) {
    ...
        Exceptions.apply {
            httpExceptions(testing)
        }
    }

    @KtorExperimentalAPI
    fun StatusPages.Configuration.httpExceptions(testing: Boolean) {
        exception<HttpError> {
            logger.warn("${this.context.request.uri} - failed due to ${it.message}${it.cause?.let {"(caused by $it)"}}", it)
            if (it.code == HttpStatusCode.Unauthorized) {
                call.unauthorized(it.body)
            } else {
                it.body?.let { body ->
                    call.respond(it.code, body)
                } ?: call.respond(it.code)
            }
        }
        ... 

@KtorExperimentalAPI
suspend fun ApplicationCall.unauthorized(maybeError: HttpErrorBody? = null): Unit {
    // set WWW-Authenticate (as per RFC required for 401 status)
    val realm = application.environment.config.property("authentication.realm").getString()
    val header = HttpAuthHeader.Parameterized(AuthenticationScheme, mapOf(HttpAuthHeader.Parameters.Realm to realm))
    response.headers.append(HttpHeaders.WWWAuthenticate, header.toString())
    // clear session
    sessions.clear<MySessionCookie>()
    // return status code and body
    maybeError?.let { body ->
        respond(HttpStatusCode.Unauthorized, body)
    } ?: respond(HttpStatusCode.Unauthorized)
}

In the log, we have:

2019-11-08 19:31:58,277 DEBUG [nioEventLoopGroup-4-1][172.26.0.8][REQ-496] auth - session 8670142f-97a2-4035-9f17-e62774e0a7c5 has expired
2019-11-08 19:31:58,277 INFO  [nioEventLoopGroup-4-1][172.26.0.8][REQ-496] application - finished POST /web/v1/logout with null
2019-11-08 19:31:58,278 WARN  [nioEventLoopGroup-4-1][172.26.0.8][REQ-496] application - /web/v1/logout - failed due to SessionExpired com.example.http.HttpError$UnauthorizedAccess: SessionExpired
        at com.example.ApplicationKt$module$12$$special$$inlined$session$lambda$1.invokeSuspend(Application.kt:297)
        at com.example.ApplicationKt$module$12$$special$$inlined$session$lambda$1.invoke(Application.kt)
        at com.example.ApplicationKt$module$12$$special$$inlined$session$1.invokeSuspend(SessionAuth.kt:156)
        at com.example.ApplicationKt$module$12$$special$$inlined$session$1.invoke(SessionAuth.kt)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
        at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
        at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
        at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
        at io.ktor.auth.Authentication.processAuthentication(Authentication.kt:228)
        at io.ktor.auth.Authentication$interceptPipeline$2.invokeSuspend(Authentication.kt:123)
        at io.ktor.auth.Authentication$interceptPipeline$2.invoke(Authentication.kt)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
        at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
        at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
        at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
        at io.ktor.routing.Routing.executeResult(Routing.kt:147)
        at io.ktor.routing.Routing.interceptor(Routing.kt:34)
        at io.ktor.routing.Routing$Feature$install$1.invokeSuspend(Routing.kt:99)
        at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
        at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
        at io.ktor.features.ContentNegotiation$Feature$install$1.invokeSuspend(ContentNegotiation.kt:106)
        at io.ktor.features.ContentNegotiation$Feature$install$1.invoke(ContentNegotiation.kt)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
        at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
        at io.ktor.features.StatusPages$interceptCall$2.invokeSuspend(StatusPages.kt:98)
        at io.ktor.features.StatusPages$interceptCall$2.invoke(StatusPages.kt)
        at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:91)
        at kotlinx.coroutines.CoroutineScopeKt.coroutineScope(CoroutineScope.kt:180)
        at io.ktor.features.StatusPages.interceptCall(StatusPages.kt:97)
        at io.ktor.features.StatusPages$Feature$install$2.invokeSuspend(StatusPages.kt:137)
        at io.ktor.features.StatusPages$Feature$install$2.invoke(StatusPages.kt)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
        at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
        at io.ktor.features.CallLogging$Feature$install$1$invokeSuspend$$inlined$withMDC$1.invokeSuspend(CallLogging.kt:226)
        at io.ktor.features.CallLogging$Feature$install$1$invokeSuspend$$inlined$withMDC$1.invoke(CallLogging.kt)
        at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:91)
        at kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:156)
        at kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
        at io.ktor.features.CallLogging$Feature$install$1.invokeSuspend(CallLogging.kt:230)
        at io.ktor.features.CallLogging$Feature$install$1.invoke(CallLogging.kt)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
        at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
        at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
        at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
        at io.ktor.server.engine.DefaultEnginePipelineKt$defaultEnginePipeline$2.invokeSuspend(DefaultEnginePipeline.kt:118)
        at io.ktor.server.engine.DefaultEnginePipelineKt$defaultEnginePipeline$2.invoke(DefaultEnginePipeline.kt)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(PipelineContext.kt:268)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(PipelineContext.kt:67)
        at io.ktor.util.pipeline.SuspendFunctionGun.proceed(PipelineContext.kt:141)
        at io.ktor.util.pipeline.SuspendFunctionGun.execute(PipelineContext.kt:161)
        at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:27)
        at io.ktor.server.netty.NettyApplicationCallHandler$handleRequest$1.invokeSuspend(NettyApplicationCallHandler.kt:36)
        at io.ktor.server.netty.NettyApplicationCallHandler$handleRequest$1.invoke(NettyApplicationCallHandler.kt)
        at kotlinx.coroutines.intrinsics.UndispatchedKt.startCoroutineUndispatched(Undispatched.kt:55)
        at kotlinx.coroutines.CoroutineStart.invoke(CoroutineStart.kt:111)
        at kotlinx.coroutines.AbstractCoroutine.start(AbstractCoroutine.kt:154)
        at kotlinx.coroutines.BuildersKt__Builders_commonKt.launch(Builders.common.kt:54)
        at kotlinx.coroutines.BuildersKt.launch(Unknown Source)
        at io.ktor.server.netty.NettyApplicationCallHandler.handleRequest(NettyApplicationCallHandler.kt:26)
        at io.ktor.server.netty.NettyApplicationCallHandler.channelRead(NettyApplicationCallHandler.kt:20)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
        at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:56)
        at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:365)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:515)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:834)

To me, it looks very similar to https://github.com/ktorio/ktor/issues/1041, but that issue was about the client, while ours is with the server.

Expected behavior

The server should not consume CPU when there is no load.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 1
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

e5l commented, Apr 5, 2021 (1 reaction)

The bug was in ktor-client-apache and it should be fixed in Ktor 1.5.3

dimartiro-py commented, Aug 27, 2020 (0 reactions)

Any news about this issue? We are having the same problem at my company with a service that we run in production.

Could it be related to #1018?
