ConflictException from docker-java when starting many containers at once

  • docker-plugin version you use: docker-plugin-1.1.4
  • jenkins version you use: 2.107.1
  • docker engine version you use: Version = swarm/1.2.8, API Version = 1.22, Docker 17.09.0-ce
  • stack trace / logs / any technical details that could help diagnose this issue

We are using the docker-plugin for basically all builds in a fairly large Jenkins installation (~70k builds / day). We infrequently (~5/day) see the following in the Jenkins log, followed by an automatic 5-minute “disabling” of a docker template by the plugin:

May 09, 2018 3:46:03 PM com.nirima.jenkins.plugins.docker.DockerCloud$1 run
SEVERE: Error in provisioning; template='DockerTemplate{configVersion=2, labelString='old-sl sl swarm_node sl_swarm_node sl_always_online sl_swarm_node_latest', connector=io.jenkins.docker.connector.DockerComputerSSHConnector@36cee2cb, remoteFs='/home/jenkins', instanceCap=500, mode=EXCLUSIVE, retentionStrategy=com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy@3ec64559, dockerTemplateBase=DockerTemplateBase{image='taas-docker-public.artifactory.swg-devops.com/swarm/dk-jenkins-slave:latest', pullCredentialsId='public.shared.artifactory.docker.registry.username.password', registry=null, dockerCommand='./setup_slave.sh', hostname='', dnsHosts=[9.9.9.9], network='', volumes=[], volumesFrom2=[], environment=[], bindPorts='', bindAllPorts=false, memoryLimit=null, memorySwap=null, cpuShares=null, privileged=true, tty=false, macAddress='null', extraHosts=[]}, removeVolumes=true, pullStrategy=PULL_NEVER, nodeProperties=[], disabled=BySystem,1 ms,4 min 59 sec,Template provisioning failed.}' for cloud='taas-internal-swarm'
com.github.dockerjava.api.exception.ConflictException: Conflict: The name 25bf9974eba5b0 is already assigned. You have to delete (or rename) that container to be able to assign 25bf9974eba5b0 to a container again.

	at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:107)
	at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:33)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)

I suspect the problem lies in how the plugin chooses a name for each container: https://github.com/jenkinsci/docker-plugin/blob/master/src/main/java/com/nirima/jenkins/plugins/docker/DockerTemplate.java#L492-L495. The container names are chosen by hashing System.nanoTime(), but per https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime-- the resolution of System.nanoTime() is only guaranteed to be “at least as good as that of currentTimeMillis()”. Two containers started within the same clock tick could therefore be given the same name, which may be the collision we are seeing.
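To illustrate the suspected failure mode, here is a minimal sketch, not the plugin's actual code. It assumes the name is derived only from the timestamp; `timeOnlyName`, `uniqueName`, and `countDistinctNames` are hypothetical helpers introduced for this example. Appending a per-JVM sequence number is one possible way to keep names distinct even when two calls observe the same nanoTime() value.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

public class ContainerNames {
    // Per-JVM counter; an assumption for this sketch, not part of the plugin.
    private static final AtomicLong SEQUENCE = new AtomicLong();

    // Hypothetical stand-in for a name derived only from a timestamp:
    // two calls that see the same tick produce the same name.
    static String timeOnlyName(long nanos) {
        return Long.toHexString(nanos);
    }

    // Timestamp plus sequence number: the sequence part differs on every
    // call within this JVM, so the combined name cannot repeat.
    static String uniqueName() {
        return Long.toHexString(System.nanoTime())
                + "-" + Long.toHexString(SEQUENCE.getAndIncrement());
    }

    // Generates n names and reports how many distinct values were seen.
    static int countDistinctNames(int n) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(uniqueName());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        long tick = System.nanoTime();
        // The same tick always maps to the same time-only name.
        System.out.println(timeOnlyName(tick).equals(timeOnlyName(tick))); // prints "true"
        // With the sequence suffix, every generated name is distinct.
        System.out.println(countDistinctNames(100_000)); // prints "100000"
    }
}
```

The counter only guarantees uniqueness within a single JVM. If several Jenkins masters shared one swarm, a random component (for example from java.util.UUID) would be a safer choice.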

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 9 (7 by maintainers)

Top GitHub Comments

pjdarton commented, May 21, 2018

I’ve updated PR #651 so that the 5-minute back-off period is now configurable.

pjdarton commented, May 24, 2018

The code changes in PR #651 have been merged, so this will be fixed in the next release.
