Could not provision second build - Jenkins master hangs and needs to be restarted
I set up a Docker Cloud with a template, pointing it at the DOCKER_HOST URI and image, under Manage Jenkins/Configure System/Cloud.
When I trigger the test job, the first build finishes successfully, but a second build of the same job, with no additional actions or changes, is never started: it hangs in the build queue in the “(pending—Waiting for next available executor)” state.
docker-plugin: 1.0.2
jenkins: 2.73.2
docker engine:
Client:
Version: 17.09.0-ce
API version: 1.31 (downgraded from 1.32)
Go version: go1.8.3
Git commit: afdb6d4
Built: Tue Sep 26 22:40:09 2017
OS/Arch: darwin/amd64
Server:
Version: 17.07.0-ce
API version: 1.31 (minimum version 1.12)
Go version: go1.8.3
Git commit: 8784753
Built: Tue Aug 29 17:41:43 2017
OS/Arch: linux/amd64
Experimental: false
Stack trace (registry DNS name masked):
Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud provision
INFO: Asked to provision 1 slave(s) for: null
Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud provision
INFO: Will provision 'registry2.****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine', for label: 'null', in cloud: 'DLB-1'
Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud addProvisionedSlave
INFO: Provisioning 'registry2.*****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine' number '0' on 'DLB-1'; Total containers: '174'
Oct 24, 2017 10:53:52 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning Image of registry2.****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine from DLB-1 with 1 executors. Remaining excess workload: 0
Oct 24, 2017 10:53:52 PM com.nirima.jenkins.plugins.docker.DockerCloud provisionFromTemplate
INFO: Trying to run container for registry2.****/aurea/eng.build/jenkins/jnlp-slave:3.10-1-alpine
Oct 24, 2017 10:54:16 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
INFO: Accepted JNLP4-connect connection #2 from /172.18.0.130:38480
Oct 24, 2017 10:54:20 PM com.nirima.jenkins.plugins.docker.DockerCloud removeJobTemplate
WARNING: Couldn't remove template for job with id: 9
Oct 24, 2017 10:54:24 PM hudson.model.Run execute
INFO: TestJob #3 main build action completed: SUCCESS
Oct 24, 2017 10:54:24 PM com.nirima.jenkins.plugins.docker.DockerSlave _terminate
INFO: Disconnected computer
Oct 24, 2017 10:54:24 PM jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
WARNING: Computer.threadPoolForRemoting [#25] for DLB-1-ae2be65a0456 terminated
java.nio.channels.ClosedChannelException
at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
at hudson.remoting.Channel.close(Channel.java:1403)
at hudson.remoting.Channel.close(Channel.java:1356)
at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:708)
at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:626)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Oct 24, 2017 10:54:40 PM com.nirima.jenkins.plugins.docker.DockerSlave$2 run
INFO: Stopped container ae2be65a04566fd1a9375d8f05edba0a8644eaa764aeaad81fc46b41d7b74ac3
Oct 24, 2017 10:54:40 PM com.nirima.jenkins.plugins.docker.DockerSlave$2 run
INFO: Shutdowned slave for ae2be65a04566fd1a9375d8f05edba0a8644eaa764aeaad81fc46b41d7b74ac3
Oct 24, 2017 10:54:41 PM com.nirima.jenkins.plugins.docker.DockerSlave$2 run
INFO: Removed container ae2be65a04566fd1a9375d8f05edba0a8644eaa764aeaad81fc46b41d7b74ac3
Issue Analytics
- State:
- Created 6 years ago
- Reactions: 1
- Comments: 22 (10 by maintainers)

You’re right that plannedCapacitySnapshot is wrong here; I need to investigate how to force it to be updated. I’ve been able to reproduce this error after interrupting a build. Investigating …
++ Thanks Nicolas
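
To illustrate why a stale plannedCapacitySnapshot could leave the second build stuck in the queue, here is a minimal toy model in Java. It is not the Jenkins NodeProvisioner code: the class name, the method `agentsToProvision`, and the simple subtraction rule are assumptions made for illustration. Only the name plannedCapacitySnapshot and the "Remaining excess workload: 0" log message come from this thread.

```java
public class StaleCapacitySketch {

    /**
     * Hypothetical decision rule: request new agents only for the part of the
     * queue that is not already covered by idle executors or by capacity that
     * is believed to be on its way (the planned-capacity snapshot).
     */
    public static int agentsToProvision(int queueLength,
                                        int idleExecutors,
                                        int plannedCapacitySnapshot) {
        int excessWorkload = queueLength - idleExecutors - plannedCapacitySnapshot;
        return Math.max(0, excessWorkload);
    }

    public static void main(String[] args) {
        // First build: nothing is planned yet, so one agent is requested.
        System.out.println(agentsToProvision(1, 0, 0)); // prints 1

        // Second build: the first container has been removed, but the snapshot
        // still counts it as planned capacity, so nothing is requested.
        System.out.println(agentsToProvision(1, 0, 1)); // prints 0

        // Same queue once the snapshot has been refreshed.
        System.out.println(agentsToProvision(1, 0, 0)); // prints 1
    }
}
```

In this model the second call corresponds to the reported symptom: the queue has one item and no executor exists, yet the computed excess workload is 0, so no container is started until the snapshot is corrected.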