[Request] Support ECS cluster auto scaling configurations

I know this was already somewhat addressed in #151; however, the plugin does not seem to support cluster auto scaling configurations where the desired count is zero.

I have an EC2-backed cluster with cluster auto scaling configured with a desired count of zero on the underlying Auto Scaling group. If I run a task within that cluster using the RunTask API action, the task goes into a PROVISIONING state while ECS tells the ASG an instance is needed. The task then runs once an instance is fully booted and available within the cluster.

With this setup, however, the plugin throws an error if it sees no container instances available in the cluster:

2020-05-20 15:48:03.187+0000 [id=266]	WARNING	c.c.j.p.amazonecs.ECSLauncher#launch: [JenkinsBuild-ec2-generic-fhldc]: Error in provisioning; agent=com.cloudbees.jenkins.plugins.amazonecs.ECSSlave[JenkinsBuild-ec2-generic-fhldc]
com.amazonaws.services.ecs.model.InvalidParameterException: No Container Instances were found in your cluster. (Service: AmazonECS; Status Code: 400; Error Code: InvalidParameterException; Request ID: ...)

Would it be possible to support this scenario by detecting the presence of an ECS capacity provider with cluster auto scaling and, instead of failing, waiting for ECS to do its thing?
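
To illustrate the "wait for ECS to do its thing" idea, here is a minimal sketch of a polling loop that keeps checking a task's status while it is still PROVISIONING. It is not the plugin's code: the class name, method name, and the `statusSource` supplier are hypothetical, and in a real integration the supplier would presumably wrap a DescribeTasks call and return the task's last status.

```java
import java.util.function.Supplier;

// Hypothetical sketch, not part of the plugin: wait while a task is PROVISIONING.
final class ProvisioningWaitSketch {

    /**
     * Polls statusSource until the reported status is no longer PROVISIONING
     * or maxAttempts status checks have been made. Returns the last status seen.
     */
    static String waitWhileProvisioning(Supplier<String> statusSource,
                                        int maxAttempts,
                                        long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        String status = statusSource.get(); // first status check
        for (int attempt = 1;
             attempt < maxAttempts && "PROVISIONING".equals(status);
             attempt++) {
            try {
                Thread.sleep(delayMillis); // give the ASG time to boot an instance
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return status;                      // stop waiting early
            }
            status = statusSource.get();
        }
        return status;
    }
}
```

A caller would treat a returned PROVISIONING status as a timeout and any other status as the signal to continue with the normal agent launch. The attempt limit and delay are assumptions to tune against how long an instance takes to boot and register.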

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 8
  • Comments:8 (1 by maintainers)

Top GitHub Comments

2 reactions
bryanburke commented, Jun 22, 2020

I did some more thorough testing with ECS cluster auto scaling. The plugin’s current requirement of launchType not only breaks configurations where no instances are running in the cluster, it also breaks scale-out entirely when managed scaling is enabled, even with instances running in the cluster already. Here is an example:

Configuration

  • ASG
    • Desired count: 1
    • Minimum size: 1
    • Maximum size: 5
  • Capacity provider
    • Assigned as default capacity provider strategy for cluster.
    • Managed scaling: enabled
      • Minimum scaling step size: 1
      • Maximum scaling step size: 10000
      • Target capacity: 100
    • Managed termination protection: enabled
  • Task definition
    • cpu: 0.25
    • memoryReservation: 512
  • Total number of running tasks per instance (t3a.medium): 7

Desired Behavior

  1. Queue 10 build tasks in Jenkins.
  2. Jenkins runs all 10 tasks in ECS.
  3. ECS places first 7 on running instance.
  4. ECS places remaining 3 in PROVISIONING status.
  5. ECS managed scaling triggers ASG scale-out to 2 instances.
  6. After second instance registers with cluster, ECS places remaining 3 tasks on it.

Actual Behavior

  1. Queue 10 build tasks in Jenkins.
  2. Jenkins runs 7 tasks in ECS. Remaining 3 stay queued in Jenkins.
  3. ECS places first 7 on running instance.
  4. After at least 3 of the 7 tasks complete, Jenkins runs remaining 3 in ECS.
  5. ECS places remaining 3 on same instance.

As you can see, no cluster auto scaling occurs because Jenkins cannot place tasks for which no instance capacity exists into the PROVISIONING status necessary for managed scaling to work.
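
The numbers in the example above can be checked with a small sketch. This is a deliberate simplification that only counts task slots per instance (7 on a t3a.medium in this configuration). It does not model how ECS managed scaling actually computes its scaling metric, and the class and method names are hypothetical.

```java
// Hypothetical sketch: slot-counting arithmetic for the desired behavior above.
final class ScaleOutArithmeticSketch {

    /** Tasks that fit on the instances already running in the cluster. */
    static int placedImmediately(int queuedTasks, int runningInstances, int tasksPerInstance) {
        return Math.min(queuedTasks, runningInstances * tasksPerInstance);
    }

    /** Tasks that would sit in PROVISIONING until more capacity registers. */
    static int leftProvisioning(int queuedTasks, int runningInstances, int tasksPerInstance) {
        return queuedTasks - placedImmediately(queuedTasks, runningInstances, tasksPerInstance);
    }

    /** Total instances needed to hold every task (ceiling division). */
    static int instancesNeeded(int queuedTasks, int tasksPerInstance) {
        return (queuedTasks + tasksPerInstance - 1) / tasksPerInstance;
    }
}
```

With 10 queued tasks, 1 running instance, and 7 tasks per instance, 7 tasks are placed immediately, 3 are left in PROVISIONING, and 2 instances are needed in total, which matches steps 3 through 5 of the desired behavior.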

This issue is similar to #138 (same cause, different effect). In order for the plugin to support cluster auto scaling, it must support capacity providers and do several things:

  1. Support capacityProviderStrategy as an alternative to launchType in task templates and use withCapacityProviderStrategy in the RunTask API request. This would support configurations where multiple capacity providers are available for use on the cluster.
  2. Allow the omission of BOTH launchType AND capacityProviderStrategy in task templates and construct the API request with NEITHER withLaunchType NOR withCapacityProviderStrategy so that the defaultCapacityProviderStrategy on the cluster takes over. This would support configurations where the default capacity provider strategy on the cluster is always desirable.
  3. Offload queuing of ECS build tasks to the cluster when launchType is null. This would ensure that Jenkins does not retain builds in its queue when capacity is not available and a capacity provider is handling scaling.
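
The three points above could be sketched as the following decision logic. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical and do not exist in the plugin, and the capacity provider strategy is reduced to a list of provider names. In the real plugin each mode would map to calling withLaunchType, calling withCapacityProviderStrategy, or calling neither on the RunTask request.

```java
import java.util.List;

// Hypothetical sketch, not plugin code: pick how the RunTask request is built.
final class PlacementModeSketch {

    enum PlacementMode {
        LAUNCH_TYPE,                // point of comparison: today's behavior
        CAPACITY_PROVIDER_STRATEGY, // point 1: explicit strategy in the template
        CLUSTER_DEFAULT             // point 2: neither set, cluster default applies
    }

    /** Chooses the placement mode from the (hypothetical) task template fields. */
    static PlacementMode choose(String launchType, List<String> capacityProviders) {
        boolean hasLaunchType = launchType != null && !launchType.isEmpty();
        boolean hasStrategy = capacityProviders != null && !capacityProviders.isEmpty();
        if (hasLaunchType && hasStrategy) {
            // ECS does not accept both on the same RunTask request.
            throw new IllegalArgumentException(
                    "launchType and capacityProviderStrategy are mutually exclusive");
        }
        if (hasLaunchType) {
            return PlacementMode.LAUNCH_TYPE;
        }
        return hasStrategy ? PlacementMode.CAPACITY_PROVIDER_STRATEGY
                           : PlacementMode.CLUSTER_DEFAULT;
    }

    /** Point 3: let the cluster queue tasks whenever launchType is not set. */
    static boolean offloadQueueingToCluster(PlacementMode mode) {
        return mode != PlacementMode.LAUNCH_TYPE;
    }
}
```

For example, a template with no launchType and no strategy resolves to CLUSTER_DEFAULT, and queueing is offloaded to the cluster so the capacity provider can see tasks in PROVISIONING and scale out.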

0 reactions
stale[bot] commented, May 28, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
