
Consumption Plan Scaling Issues


I am attempting to measure some performance metrics of Azure Functions and other serverless platforms, and am having trouble getting a test function to scale in a Consumption plan. I am creating a Node.js function that busy-waits in a loop for 1 second, and then exits. Then, I am triggering that function once per second on a new Consumption plan. I continue this request rate for five minutes. Here is a graph of one of these tests, where the vertical axis is the response time minus the function duration (1 second), and the horizontal axis is the number of seconds into the test:

[Graph: response time minus function duration (vertical axis) vs. seconds into the test (horizontal axis)]

The idea is that a few of these requests arrive while the first instance is being assigned to the function, and then that instance tries to play catch-up with the backlog (to no avail, because it is receiving 1 second of work per second). It seems like either the queued requests or the CPU load from the busy-waiting should trigger scaling, but I’m not seeing any additional instances added to the function. Is there something I’m doing wrong here?

Repro steps

  1. Create function app inside an App Service with a Consumption plan

  2. Deploy the following function with an HTTP trigger to the function app:

'use strict';

var id;

module.exports.test = function (context, req) {
  var start = Date.now();

  if (typeof id === 'undefined') {
    id = uuid();
  }

  while (Date.now() < start + req.body.duration) {}

  context.done(null, { body: {
    duration: Date.now() - start,
    id: id
  }});
};

function uuid(a) {
  return a?(a^Math.random()*16>>a/4).toString(16):([1e7]+-1e3+-4e3+-8e3+-1e11).replace(/[018]/g,uuid);
}
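For readers puzzling over the minified `uuid()` helper above, here is a readable equivalent (my own expansion, not part of the original repro): it produces an RFC 4122 version 4 UUID from `Math.random()`, which is fine as an instance marker but not cryptographically strong.

```javascript
// Readable equivalent of the compact uuid() helper: each 'x' becomes a random
// hex nibble, the version nibble is fixed to 4, and the variant nibble ('y')
// is forced into the 8-b range required by UUID v4.
function uuidv4() {
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, c => {
    const r = Math.random() * 16 | 0;          // random nibble 0-15
    const v = c === 'x' ? r : (r & 0x3) | 0x8; // variant nibble must be 8, 9, a, or b
    return v.toString(16);
  });
}

console.log(uuidv4());
```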
  3. Trigger this function once per second with the following request body:
{
    "duration": 1000
}
  4. Collect the response times of the function under the 1 request per second load for a few minutes.

  5. Observe that the returned id does not change, and the response times are consistently longer than 1 second.

Expected behavior

The first request begins execution somewhere between 4 and 30 seconds after the start of the test (I’ve been seeing a large range in cold-start response times; is this typical?), presumably because no instances were assigned to the function and it takes some time to add the first one. It seems like a second instance could be assigned to the function in a similar amount of time (at which point the function can complete 2 seconds of work per second), and the response times would then decrease linearly to near 0. Am I wrong about the behavior I expect, and if so, what should I be seeing scalability-wise?
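The catch-up argument above can be sketched with a toy queueing model (my own back-of-the-envelope simulation, not Azure behavior): work arrives at 1 second per second, each instance drains 1 second of work per second, and no instances exist during the cold start.

```javascript
// Toy model of the backlog (seconds of queued work) over a test run.
// instances: instances serving after cold start; coldStartSec: seconds before
// the first instance exists; seconds: length of the simulated test.
function simulateBacklog(instances, coldStartSec, seconds) {
  let backlog = 0;
  const samples = [];
  for (let t = 0; t < seconds; t++) {
    backlog += 1; // one request per second, each carrying 1 s of work
    const capacity = t < coldStartSec ? 0 : instances;
    backlog = Math.max(0, backlog - capacity);
    samples.push(backlog);
  }
  return samples;
}

// One instance can never drain the cold-start backlog; two drain it linearly.
console.log(simulateBacklog(1, 10, 20)); // backlog plateaus at 10
console.log(simulateBacklog(2, 10, 20)); // backlog falls back to 0
```

This is exactly the shape of the two graphs: a flat plateau with one instance, a linear decline to zero once capacity exceeds the arrival rate.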

Actual behavior

Described above. This contrasts with running the same experiment again, except without any waiting, by sending this as the request body:

{
    "duration": 0
}

This results in a graph with a similar cold-start latency, but the function easily catches up with the requests because they carry no work:

[Graph: same test with duration 0 — similar cold-start latency, then response times return to near 0]

Known workarounds

Hoping to find one.

Related information

Let me know if you want me to provide any additional information.

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 9
  • Comments: 36 (17 by maintainers)

Top GitHub Comments

4 reactions
davidebbo commented, Apr 12, 2017

> I wonder whether there could be an option to set the maximum number of concurrent function calls per instance?

Yep, we are adding this very option! See https://github.com/Azure/azure-webjobs-sdk-script/wiki/Http-Functions#throttling for details. This should be available by the middle of next week.
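For reference, the throttling knobs described on that wiki page live under the `http` section of host.json. The setting names below are per the v1 Functions runtime; treat this as an illustrative sketch and check the linked page for the current schema:

```json
{
  "http": {
    "maxOutstandingRequests": 200,
    "maxConcurrentRequests": 10,
    "dynamicThrottlesEnabled": true
  }
}
```

With `maxConcurrentRequests` set, requests beyond the limit queue (up to `maxOutstandingRequests`) instead of piling onto a busy instance, which may also give the scale controller a clearer signal.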

3 reactions
Gorthog commented, Oct 30, 2017

I created an Azure Function that busy-waits 30 ms, and a simple load test that issues an HTTP request every 20 ms, for a total of 1000 requests. The results are simply horrible: I see an average of almost 30 seconds per request. After limiting the concurrent HTTP requests in host.json to 1 (!) I managed to get down to 7–10 seconds per request. As far as I can tell, scaling does not work at all in my case: I see a single instance ID. When running the same test with a 200 ms delay, I get what I expect, an average of ~190 ms per call.

Is there a workaround to make scaling work with http trigger on consumption plans?

Here is my Azure Function:

[FunctionName("DataQuery")]
public static async Task<HttpResponseMessage> Run(
    [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = null)]
    DataQueryRequest dataQueryRequest,
    ILogger log)
{
    var stopper = new Stopwatch();
    stopper.Start();
    BusyWait(30);

    var resultObj = JObject.FromObject(new
    {
        Duration = stopper.ElapsedMilliseconds,
        InstanceId = Environment.GetEnvironmentVariable("WEBSITE_INSTANCE_ID", EnvironmentVariableTarget.Process),
    });

    return new HttpResponseMessage(HttpStatusCode.OK)
    {
        Content = new StringContent(resultObj.ToString(), System.Text.Encoding.UTF8, "application/json")
    };
}

private static void BusyWait(int ms)
{
    var stopper = new Stopwatch();
    stopper.Start();
    do
    {
        // spin until ms milliseconds have elapsed
    }
    while (stopper.ElapsedMilliseconds < ms);
}