How to avoid unexpected end of stream errors?
Describe the issue
When performing a lot of requests in a highly concurrent environment, we sometimes get the following error:
java.io.IOException: unexpected end of stream on https://s3.us-east-2.amazonaws.com/
We have many coroutines that can send data to S3 concurrently. For performance reasons they all share the same aws.sdk.kotlin.services.s3.S3Client
instance, and the connection is not closed after each request, because otherwise the other requests would not work.
I saw in this post that a workaround is to add a header: .header("Connection", "close").
Is it possible to add this header using the AWS SDK? Or is there another way to avoid this error?
Sometimes we can send 10,000 requests without any issue; sometimes a few of them fail with the stack trace I posted.
The failures do not occur in any specific order. For example, if we send 3 requests we can get SUCCESS, FAILURE, SUCCESS, or all three can succeed.
Thanks
Steps to Reproduce
- Create 30+ coroutines that send PUT requests to S3 in a loop for a prolonged period of time (20 min+).
- Notice the exception
Current behavior
11:02:51.212 [DefaultDispatcher-worker-10] ERROR aws.smithy.kotlin.runtime.http.engine.ktor.KtorEngine - throwing
java.io.IOException: unexpected end of stream on https://s3.us-east-2.amazonaws.com/...
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:202)
at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:106)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:79)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:517)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.EOFException: \n not found: limit=0 content=…
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.kt:332)
at okhttp3.internal.http1.HeadersReader.readLine(HeadersReader.kt:29)
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:178)
... 16 common frames omitted
11:02:51.214 [DefaultDispatcher-worker-10] DEBUG Retry - sdkRequestId: 23b27b9e-d630-4e3f-af6b-ec0e08ddd764; service: S3; operation: PutObject; - request failed with non-retryable error
11:02:51.215 [DefaultDispatcher-worker-20] ERROR aws.smithy.kotlin.runtime.http.engine.ktor.KtorEngine - throwing
java.util.concurrent.CancellationException: Parent job is Cancelling
at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:39)
at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:37)
at kotlinx.coroutines.InvokeOnCancelling.invoke(JobSupport.kt:1457)
at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1499)
at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:795)
at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:755)
at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:671)
at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:637)
at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1465)
at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1499)
at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:795)
at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:755)
at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:671)
at kotlinx.coroutines.JobSupport.childCancelled(JobSupport.kt:651)
at kotlinx.coroutines.ChildHandleNode.childCancelled(JobSupport.kt:1466)
at kotlinx.coroutines.JobSupport.cancelParent(JobSupport.kt:358)
at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:332)
at kotlinx.coroutines.JobSupport.tryMakeCompletingSlowPath(JobSupport.kt:900)
at kotlinx.coroutines.JobSupport.tryMakeCompleting(JobSupport.kt:863)
at kotlinx.coroutines.JobSupport.makeCompletingOnce$kotlinx_coroutines_core(JobSupport.kt:828)
at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:100)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
at kotlinx.coroutines.UndispatchedCoroutine.afterResume(CoroutineContext.kt:147)
at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:102)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
Exception in thread "main" java.io.IOException: unexpected end of stream on https://s3.us-east-2.amazonaws.com/...
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:202)
at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:106)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:79)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:517)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.EOFException: \n not found: limit=0 content=…
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.kt:332)
at okhttp3.internal.http1.HeadersReader.readLine(HeadersReader.kt:29)
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:178)
... 16 more
AWS Kotlin SDK version used
0.14.2-beta
Platform (JVM/JS/Native)
JVM
Operating System and version
Ubuntu 20.04.3 LTS
Issue Analytics
- Created a year ago
- Comments: 5
Top GitHub Comments
I’ve looked at this a bit.
My initial investigation took me to this okhttp issue which links to this ktor issue as the culprit.
That issue suggests that it was happening because retryOnConnectionFailure was set to false. This was resolved over a year ago in ktor, though, and I have verified that this setting is being set to true.
I have not otherwise been able to recreate this. Our next short-term step here is going to be to enable additional logging that may help us understand what is going on. This will hopefully be available in the next release, and I’ll update this ticket with instructions.
Slightly longer term, we are looking to remove ktor and bind directly to okhttp, as well as upgrade to ktor-2.x (which we still use for byte channel abstractions internally). I don’t expect either of these to fix the issue per se, but making those changes will undoubtedly have the potential to cause subtle changes in how requests are executed. This is more of an FYI than anything.
⚠️COMMENT VISIBILITY WARNING⚠️
Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.
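The DEBUG log line in the report shows the SDK classifying this IOException as non-retryable, so the failed PutObject is not retried automatically. Until the root cause is found, one possible stopgap is to retry at the call site. The sketch below is a minimal, stdlib-only illustration. `retryOnIOException` is a hypothetical helper, not part of the AWS SDK, and it assumes that repeating the request is safe for your workload (for example, re-sending a PutObject with the same key and a replayable body).

```kotlin
import java.io.IOException

/**
 * Hypothetical helper (not an AWS SDK API): runs [block] and retries it
 * when it throws an IOException, up to [maxAttempts] attempts in total.
 * The 1-based attempt number is passed to [block] for logging purposes.
 */
inline fun <T> retryOnIOException(maxAttempts: Int = 3, block: (attempt: Int) -> T): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    for (attempt in 1 until maxAttempts) {
        try {
            return block(attempt)
        } catch (e: IOException) {
            // Assumed transient (e.g. a pooled connection closed by the peer):
            // swallow the error and fall through to the next attempt.
        }
    }
    // Final attempt: any exception thrown here propagates to the caller.
    return block(maxAttempts)
}

fun main() {
    // Simulated upload that fails twice before succeeding.
    val result = retryOnIOException(maxAttempts = 3) { attempt ->
        if (attempt < 3) throw IOException("unexpected end of stream (simulated)")
        "uploaded on attempt $attempt"
    }
    println(result) // prints "uploaded on attempt 3"
}
```

Because the helper is `inline`, the lambda may contain suspending calls when it is invoked from a coroutine, so wrapping a real upload would look like `retryOnIOException { s3.putObject(request) }`, where `s3` and `request` stand in for your own client and request objects. The sketch retries immediately; a delay between attempts is left out to keep it dependency-free.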