Agents fail to provision after restart
After Jenkins has been restarted, agents fail to provision with the following messages in the logs:
May 22, 2020 6:38:55 AM FINE com.cloudbees.jenkins.plugins.amazonecs.ECSLauncher
ECS: Launching agent
May 22, 2020 6:38:55 AM FINE com.cloudbees.jenkins.plugins.amazonecs.ECSLauncher
[ecs-cloud-ecs-main-fmcpc]: Creating Task in cluster null
May 22, 2020 6:38:55 AM WARNING com.cloudbees.jenkins.plugins.amazonecs.ECSLauncher launch
[ecs-cloud-ecs-main-fmcpc]: Error in provisioning; agent=com.cloudbees.jenkins.plugins.amazonecs.ECSSlave[ecs-cloud-ecs-main-fmcpc]
java.lang.NullPointerException
at com.cloudbees.jenkins.plugins.amazonecs.ECSService.registerTemplate(ECSService.java:150)
at com.cloudbees.jenkins.plugins.amazonecs.ECSLauncher.getTaskDefinition(ECSLauncher.java:205)
at com.cloudbees.jenkins.plugins.amazonecs.ECSLauncher.launch(ECSLauncher.java:107)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:292)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
May 22, 2020 6:38:55 AM FINER com.cloudbees.jenkins.plugins.amazonecs.ECSLauncher
[ecs-cloud-ecs-main-fmcpc]: Removing Jenkins node
All builds using ECS agents fail the same way. For context, we use Fargate agents in declarative pipelines, some with overrides on memory, CPU, or image.
Modifying and saving the agent config resolves the issue temporarily, but it returns as soon as Jenkins is restarted.
- Jenkins v2.222.3
- amazon-ecs-plugin v1.34
~The bit that caught my attention in the logs was Creating Task in cluster null - presumably that’s not a good sign? Any ideas why the cluster would be null after a restart?~ (this appears to be unrelated, even successful provisioning has this)
This only seems to have begun occurring after we upgraded from v1.26 of the plugin.
Issue Analytics
- State:
- Created 3 years ago
- Reactions:11
- Comments:26 (5 by maintainers)

Fix that worked for me: go to https://<jenkins>/configureClouds/ and click Save, then delete the nodes which were being created.
I think I have worked out the fix for at least one variation of this (the NPE on registerTemplate).
It was difficult to debug, as I believe the ECSCloud class is serialized; I could never get the debugger to pause on the constructor so concluded it may have been serialized. From my debugging ECSService would always end up with its Supplier=null after restart. Thankfully ECSService is already lazy-loaded via a call to ECSCloud.getEcsService(), so it doesn’t actually need to be preserved with ECSCloud. I switched that field to transient, and have had several successful restarts where I don’t encounter this error anymore.
I’ve created PR #216 if anybody would like to test and verify they see the same success.
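For anyone who wants to see the idea in isolation, here is a minimal sketch of the transient-plus-lazy-getter pattern described above. It is not the plugin's code: the class names Cloud and Service, the region field, and the string-returning supplier are all hypothetical stand-ins, and it uses plain Java serialization to simulate a restart. Jenkins persists its configuration with XStream rather than Java serialization, but XStream also skips transient fields, so the behaviour should be comparable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Supplier;

public class TransientLazyDemo {

    // Hypothetical stand-in for ECSService: it depends on a Supplier that
    // cannot be meaningfully restored from saved state.
    static class Service {
        private final Supplier<String> clientSupplier;

        Service(Supplier<String> clientSupplier) {
            this.clientSupplier = clientSupplier;
        }

        String registerTemplate(String name) {
            // Throws NullPointerException if clientSupplier is null.
            return clientSupplier.get() + ":" + name;
        }
    }

    // Hypothetical stand-in for ECSCloud: the persisted configuration object.
    static class Cloud implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String region;

        // transient: never written to or read from the saved state, so a
        // half-restored Service can never come back after a restart.
        private transient Service service;

        Cloud(String region) {
            this.region = region;
        }

        // Lazy getter: rebuilds the service on first use after deserialization.
        synchronized Service getService() {
            if (service == null) {
                final String r = region;
                service = new Service(() -> "client-" + r);
            }
            return service;
        }
    }

    // Simulates "save config, restart, load config" with Java serialization.
    static Cloud roundTrip(Cloud cloud) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(cloud);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Cloud) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Cloud original = new Cloud("eu-west-1");
        original.getService(); // service exists before the simulated restart

        Cloud restored = roundTrip(original);
        // The transient field comes back as null and is rebuilt by the getter.
        System.out.println(restored.getService().registerTemplate("fargate-agent"));
    }
}
```

The design point is that only plain configuration (here, region) is persisted, and everything derived from it is rebuilt on demand, so callers must always go through the getter rather than reading the field directly.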