Intermittent build failures - read timed out
See original GitHub issue

Your checklist for this issue
- Jenkins version
- Plugin version
- Bitbucket cloud
- Bitbucket server and version
Description
Jenkins version: 2.176.2
Plugin version: 2.4.5
Bitbucket cloud

We are experiencing issues with the bitbucket-branch-source-plugin. From time to time, our builds fail with the following exception. This mainly seems to happen at night, when jobs run via the cron trigger. We suspect that the HTTP connection pool used by the BitbucketCloudApiClient contains stale connections after they have sat unused for several minutes or hours.
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.executeMethod(BitbucketCloudApiClient.java:769)
at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.getRequestAsInputStream(BitbucketCloudApiClient.java:792)
Caused: java.io.IOException: Communication error for url: https://api.bitbucket.org/2.0/repositories/[redacted]/[redacted]/refs/branches
at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.getRequestAsInputStream(BitbucketCloudApiClient.java:811)
at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.getRequest(BitbucketCloudApiClient.java:816)
at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.getBranchesByRef(BitbucketCloudApiClient.java:457)
at com.cloudbees.jenkins.plugins.bitbucket.client.BitbucketCloudApiClient.getBranches(BitbucketCloudApiClient.java:449)
at com.cloudbees.jenkins.plugins.bitbucket.BitbucketSCMSource.retrieve(BitbucketSCMSource.java:812)
at jenkins.scm.api.SCMSource.fetch(SCMSource.java:582)
at org.jenkinsci.plugins.workflow.multibranch.SCMBinder.create(SCMBinder.java:98)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:293)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
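If stale pooled connections are indeed the cause, Apache HttpClient (which the plugin uses, per the stack trace) ships built-in mitigations. The following is a minimal configuration sketch of the standard HttpClient 4.4+ builder API, not the plugin's actual setup; the class name is hypothetical:

```java
import java.util.concurrent.TimeUnit;

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class StaleConnectionSafeClient {
    public static CloseableHttpClient build() {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        // Re-validate a pooled connection before handing it out if it
        // has been idle for more than 5 seconds (HttpClient 4.4+).
        cm.setValidateAfterInactivity(5_000);

        return HttpClients.custom()
                .setConnectionManager(cm)
                // A background thread closes expired connections...
                .evictExpiredConnections()
                // ...and connections idle longer than 30 seconds, so
                // they are never reused after going stale overnight.
                .evictIdleConnections(30, TimeUnit.SECONDS)
                .build();
    }
}
```

Either setting alone would likely shrink the window in which a half-closed connection can be picked from the pool; combining them trades a little per-request validation overhead for robustness.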
Issue Analytics
- State:
- Created 4 years ago
- Comments: 12 (5 by maintainers)
Top GitHub Comments
So far, no problems on my end after this update. Thanks for your work @timpeeters and all involved!
After one week of running a patched version of the plugin, I can confirm that the failed scans and checkouts no longer appear for us. So for us, the patch that retries idempotent operations worked. I think it is a good addition to the code base: HTTP requests can fail for a large number of reasons, and retrying idempotent operations is safe and common practice.
I understand that for non-idempotent (e.g. POST) requests we may need a different solution. My gut feeling also points toward stale connections in the pool. I'll try a local build that disables connection re-use altogether, just to confirm that this is indeed the source of the problem. I'll create a separate GitHub issue and PR if I find out more.
However, I would appreciate treating this issue and the PR for idempotent requests separately. Is that reasonable?
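The retry-for-idempotent-operations patch discussed in the comments above can be sketched generically. This is not the plugin's actual code, just a minimal stdlib illustration of the idea: retry a call a bounded number of times, but only on IOException, which is safe when the wrapped HTTP request is idempotent (GET/HEAD):

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class IdempotentRetry {
    /**
     * Runs {@code call}, retrying up to {@code maxAttempts} times on
     * IOException (e.g. a read timeout caused by a stale pooled
     * connection). Only safe for idempotent operations, since the
     * server may have partially processed a failed attempt.
     */
    public static <T> T retry(Callable<T> call, int maxAttempts) throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e; // transient network failure: try again
            }
        }
        throw last;
    }
}
```

A non-idempotent request would need a different strategy (e.g. the connection-eviction settings mentioned earlier), because blindly resending a POST can duplicate its side effects.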