Sporadic OkHttp errors after upgrading to ktor 1.3.1

Ktor Version and Engine Used (client or server and name)

implementation("io.ktor:ktor-client-core-jvm:1.3.1")
implementation("io.ktor:ktor-client-core:1.3.1")
implementation("io.ktor:ktor-client-jackson:1.3.1")
implementation("io.ktor:ktor-client-logging-jvm:1.3.1")
implementation("io.ktor:ktor-client-okhttp:1.3.1")
implementation("io.ktor:ktor-jackson:1.3.1")
implementation("io.ktor:ktor-metrics-micrometer:1.3.1")
implementation("io.ktor:ktor-metrics:1.3.1")
implementation("io.ktor:ktor-server-host-common:1.3.1")
implementation("io.ktor:ktor-server-netty:1.3.1")

Describe the bug

We’ve started seeing sporadic OkHttp exceptions in our tests when we upgraded from 1.3.0 to 1.3.1. Upon downgrading back to 1.3.0 (with no other changes) the tests were fine again.

The exception is:

java.io.EOFException: \n not found: limit=0 content=…
	at o.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:240)
	at o.i.h.Http1ExchangeCodec.readHeaderLine(Http1ExchangeCodec.java:242)
	at o.i.h.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.java:213)
	... 20 common frames omitted
Wrapped by: java.io.IOException: unexpected end of stream on http://called-service/...
	at o.i.h.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.java:236)
	at o.i.c.Exchange.readResponseHeaders(Exchange.java:115)
	at o.i.h.CallServerInterceptor.intercept(CallServerInterceptor.java:94)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
	at o.i.c.ConnectInterceptor.intercept(ConnectInterceptor.java:43)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
	at o.i.c.CacheInterceptor.intercept(CacheInterceptor.java:94)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
	at o.i.h.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
	at o.i.h.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
	at o.i.h.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:221)
	at o.RealCall$AsyncCall.execute(RealCall.java:172)
	at o.i.NamedRunnable.run(NamedRunnable.java:32)
	at j.u.c.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at j.u.c.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(Thread.java:834)

It looks like the HTTP client receives an EOF while reading the response headers from calling http://called-service/… I don’t think the culprit is the remote web server though, because this server is called from other services and it’s only the one with ktor 1.3.1 that throws these errors. And this one is also fine with ktor 1.3.0.

This issue has already been raised before in the OkHttp repository, for example here.

From what I can see, ktor 1.3.0 and 1.3.1 both depend on com.squareup.okhttp3:okhttp:3.14.2, so maybe this is due to a change in the way ktor uses OkHttp?

To Reproduce

Not easy to reproduce, unfortunately, because the error only occurs in a small percentage of requests (~5%). I’m hoping someone else observes this issue and can come up with a reproducible test.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 7
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

fluidsonic commented, Apr 6, 2020 (14 reactions)

I’ve also got this issue.

  • Ktor 1.3.2
  • OkHttp 4.4.1
  • Ktor config is as simple as HttpClient(OkHttp).

The issue does not occur if I send a Connection: close header with each request. Given that, it’s likely an issue related to connection reuse.
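For anyone who wants to try the same experiment, here is a minimal sketch of sending Connection: close with every request. It assumes the Ktor 1.3.x package layout (io.ktor.client.features) and uses the defaultRequest helper; it is a diagnostic aid rather than a recommended fix, since it disables connection reuse entirely.

```kotlin
import io.ktor.client.HttpClient
import io.ktor.client.engine.okhttp.OkHttp
import io.ktor.client.features.defaultRequest
import io.ktor.client.request.header
import io.ktor.http.HttpHeaders

// Assumption: Ktor 1.3.x, where defaultRequest lives in io.ktor.client.features.
val client = HttpClient(OkHttp) {
    defaultRequest {
        // Ask the server to close the connection after each response,
        // so no pooled connection is ever reused.
        header(HttpHeaders.Connection, "close")
    }
}
```

If the errors disappear with this client, that supports the connection-reuse theory, at the cost of a new connection per request.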

I’ve managed to reproduce it on Android as well as on a regular JVM. With the following code, the second request fails 100% for me:

HttpClient(OkHttp).use { httpClient ->
    coroutineScope {
        repeat(10) {
            try {
                val data = httpClient.request<ByteArray>("https://***hidden***")
                println("Received ${data.size} bytes.")
            } catch (e: Throwable) {
                e.printStackTrace()
            }

            delay(5_000)
        }
    }
}

Unfortunately I cannot share the URL (publicly).

If the delay between requests is >= 5 seconds and < 5 minutes, then every other request will fail.

  • 5 minutes is the default maximum keep-alive time of OkHttpClient.
  • 5 seconds is probably some server-internal keep-alive limit.

My guess is that the request fails with that EOFException when a connection is still sitting in the connection pool for reuse, but the server has already closed it because it timed out server-side. The failed request discards the stale connection, so the next request opens a fresh one and succeeds, which is why only every other request is affected.
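If that guess is right, simply repeating the failed call once should succeed. The helper below is a hypothetical illustration (retryOnceOnIOException is not part of Ktor or OkHttp) and is only appropriate for requests that are safe to send twice. The main function simulates a stale connection rather than making a real request.

```kotlin
import java.io.EOFException
import java.io.IOException

// Hypothetical helper, not a Ktor or OkHttp API: runs `block`, and if it
// fails with an IOException, runs it exactly one more time.
// It is inline so that it can also wrap suspending calls.
inline fun <T> retryOnceOnIOException(block: () -> T): T =
    try {
        block()
    } catch (first: IOException) {
        // The first attempt may have used a pooled connection that the
        // server had already closed; the second attempt is not guarded.
        block()
    }

fun main() {
    var attempts = 0
    val result = retryOnceOnIOException {
        attempts++
        // Simulate a stale pooled connection on the first attempt only.
        if (attempts == 1) throw EOFException("simulated stale connection")
        "ok"
    }
    println("$result after $attempts attempts") // prints "ok after 2 attempts"
}
```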

What’s confusing is that I don’t have that issue if I use OkHttpClient directly. The same test (request + 5 second waiting + request) works perfectly fine there, so the problem is limited to Ktor for me.

Update

Ktor automatically sets OkHttpClient’s retryOnConnectionFailure to false. As per its documentation, this is one of the cases that is handled by the retry logic:

Stale pooled connections. The ConnectionPool reuses sockets to decrease request latency, but these connections will occasionally time out.

When I set retryOnConnectionFailure to false in an OkHttpClient I can reproduce the issue there even without Ktor!
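A rough sketch of that Ktor-free reproduction, assuming OkHttp 4.x (where response.body is a property) and a placeholder URL. Whether the second call actually fails depends on the server’s keep-alive timeout, so treat it as a starting point, not a guaranteed repro.

```kotlin
import java.io.IOException
import okhttp3.OkHttpClient
import okhttp3.Request

fun main() {
    // Same setting Ktor applies by default, according to the comment above.
    val client = OkHttpClient.Builder()
        .retryOnConnectionFailure(false)
        .build()

    // Placeholder URL (assumption): substitute a server whose keep-alive
    // timeout is shorter than the delay below.
    val request = Request.Builder().url("https://example.invalid/").build()

    repeat(2) {
        try {
            client.newCall(request).execute().use { response ->
                val size = response.body?.bytes()?.size ?: 0
                println("Received $size bytes.")
            }
        } catch (e: IOException) {
            e.printStackTrace()
        }
        // Wait long enough for the server to drop the idle connection.
        Thread.sleep(5_000)
    }
}
```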

So the workaround is pretty simple:

HttpClient(OkHttp) {
    engine {
        config {
            retryOnConnectionFailure(true)
        }
    }
}

paderick commented, Mar 16, 2021 (1 reaction)

This issue still persists in ktor release 1.5.2.

A workaround is to use a custom ConnectionPool with a shorter keep-alive, like the following:

HttpClient(OkHttp) {
    engine {
        clientCacheSize = 0
        config {
            connectionPool(ConnectionPool(5, 10, TimeUnit.SECONDS))
        }
    }
}

For reference, OkHttp’s default pool is ConnectionPool(5, 5, TimeUnit.MINUTES), i.e. up to 5 idle connections kept alive for 5 minutes.

