question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

losing connection with java.io.IOException: Unexpected termination of the channel after a few minutes

See original GitHub issue

Setup

  • Docker-plugin: version: 1.1.5
  • Jenkins version: 2.121.3 (LTS) and 2.136 weekly
  • Docker engine version: 18.06.0-ce, API Version = 1.38

We are using a custom image based on the latest jenkins/ssh-slave. Agents are started with attach connect method and run with a freestyle job.

Symptom

We are using the agents to do automatic testing with Robot Framework. After updating the Docker plugin to version 1.1.5 our agents fail every time after a few minutes with the console log:

java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
	at hudson.remoting.Command.readFrom(Command.java:140)
	at hudson.remoting.Command.readFrom(Command.java:126)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: java.io.IOException: Backing channel 'docker-002g9n6u4hpkj' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
	at com.sun.proxy.$Proxy222.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1143)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1135)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
	at hudson.model.Run.execute(Run.java:1815)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:97)
	at hudson.model.Executor.run(Executor.java:429)
FATAL: Unable to delete script file /tmp/jenkins5950680125503140381.sh
java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
	at hudson.remoting.Command.readFrom(Command.java:140)
	at hudson.remoting.Command.readFrom(Command.java:126)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on docker-002g9n6u4hpkj failed. The channel is closing down or has closed down
	at hudson.remoting.Channel.call(Channel.java:948)
	at hudson.FilePath.act(FilePath.java:1070)
	at hudson.FilePath.act(FilePath.java:1059)
	at hudson.FilePath.delete(FilePath.java:1563)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:123)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
	at hudson.model.Run.execute(Run.java:1815)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:97)
	at hudson.model.Executor.run(Executor.java:429)

And the Jenkins log shows:

I/O error in channel docker-002h28zxulg3r
java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
	at hudson.remoting.Command.readFrom(Command.java:140)
	at hudson.remoting.Command.readFrom(Command.java:126)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)

INFO io.jenkins.docker.DockerTransientNode$1 println
Disconnected computer for node 'docker-002h28zxulg3r'.
INFO io.jenkins.docker.DockerTransientNode$1 println
Removed Node for node 'docker-002h28zxulg3r'.
SEVERE com.github.dockerjava.core.async.ResultCallbackTemplate onError
Error during callback
com.github.dockerjava.api.exception.NotModifiedException: 
	at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:97)
	at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:33)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:997)
	at io.netty.channel.epoll.EpollDomainSocketChannel$EpollDomainUnsafe.epollInReady(EpollDomainSocketChannel.java:138)
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:401)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:306)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)

If the plugin version is downgraded to 1.1.4 everything works as it used to and no connection loss appear. Tried also to set the system parameter com.nirima.jenkins.plugins.docker.DockerContainerWatchdog.enabled=false but it had no effect on this issue.

Issue Analytics

  • State:closed
  • Created 5 years ago
  • Reactions:3
  • Comments:28 (16 by maintainers)

github_iconTop GitHub Comments

13reactions
pjdartoncommented, Apr 23, 2019

OK, so to summarize for anyone else who stumbles upon this issue in search of a solution…

This was caused by having another Jenkins server with the same instance ID that was using the same docker host. That was the magic combination.

  • If you’ve got multiple Jenkins servers using the same docker host, you MUST ensure that they have different instance IDs, otherwise the background cleanup DockerContainerWatchdog will kill the other Jenkins server’s containers.
  • If you can’t give them different instance IDs then you’ll need to set system parameter com.nirima.jenkins.plugins.docker.DockerContainerWatchdog.enabled=false on ALL the Jenkins servers sharing that ID that share the same docker hosts.
2reactions
dbnicholsoncommented, Apr 19, 2019

I can only assume that, somehow, your two Jenkins servers have the same ID … or your docker daemon is ignoring the filter it’s given.

You are correct. I figured that out later. When I created our development setup I copied the production data including the instance ID key. I regenerated that now following the advice at https://support.cloudbees.com/hc/en-us/articles/215578778-How-to-fix-an-instance-having-an-invalid-identity-key-enc-file-. I’ll test out whether this works correctly now. Sorry for all the noise!

Read more comments on GitHub >

github_iconTop Results From Across the Web

Unexpected termination of the channel after a few minutes ...
losing connection with java.io.IOException: Unexpected termination of the channel after a few minutes. This is happening randomley ...
Read more >
java.io.IOException: Unexpected termination of the channel
We are seeing slaves being displayed as offline with an Unexpected termination of channel IO exception. The slaves are actually running as seen...
Read more >
Jenkins slave jobs failing on "Unexpected termination of ...
You'll see that error if the Jenkins master loses connectivity with the slave. It could be due to any of the three issues...
Read more >
remoting issue - Google Groups
Ubuntu 16.04.5 ... ERROR: Socket connection to SSH server was lost java.net. ... java.io.IOException: Unexpected termination of the channel
Read more >
OSX Jenkins Slave terminates with java.io.IOException
io.IOException: Unexpected termination of the channel . Despite that, we still experience this exception. Strange is, that the exception is not ...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found