Long delay between starting and triggering orchestration function

See original GitHub issue

Describe the bug

I have two functions:

[FunctionName("TriggerViaHttp")]
public static async Task<HttpResponseMessage> TriggerViaHttpAsync(
    [HttpTrigger(AuthorizationLevel.Anonymous, "GET", Route = "start")] HttpRequest request,
    [OrchestrationClient] DurableOrchestrationClient orchestrationClient,
    ExecutionContext executionContext,
    CancellationToken cancellationToken)
{
    await orchestrationClient.StartNewAsync("StartOrchestration", new { });

    return new HttpResponseMessage(HttpStatusCode.OK);
}


[FunctionName("StartOrchestration")]
public static Task StartOrchestrationAsync(
    [OrchestrationTrigger] DurableOrchestrationContext context,
    ExecutionContext executionContext,
    CancellationToken cancellationToken) => Task.CompletedTask;

The problem is a long delay between the call to orchestrationClient.StartNewAsync and the actual execution of the StartOrchestration function.

Investigative information

  • Durable Functions extension version: 1.7.1
  • Function App version: 2.0
  • Programming language used: C#

To Reproduce

  1. Send a GET /start request

Expected behavior

The StartOrchestration function is triggered within milliseconds of sending this request.

Actual behavior

Sometimes the StartOrchestration function is triggered only after a significant amount of time, such as several seconds. (I can even see messages in the %taskHubName%-control queue in the storage account that were not consumed immediately.)

Additional context

I can reproduce this behavior only if another, non-durable function is deployed to the same function app. Is it even correct to have durable and non-durable functions in the same function app? I can’t find anything about it in the documentation.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 3
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

18 reactions
myarotskaya commented, Feb 13, 2019

I get the point that a cold start might take up to 30 sec, and I think that’s something we can live with.

But what I described (a 6 sec delay) looks like a different case (not a cold start), since I can see from the logs that instance c5ebb91e036a688675d9c1e68ffd5bf888fffe4ddd19de33f4ad2a13b18f9128 was active about 14 sec before the last (delayed) execution.

Real use-case

I use Durable Functions for a chat system. The design of this system is the same as in the repository I shared in previous comments: there is an HTTP trigger that initiates sending a message to the chat, and an activity function that actually sends the message via web sockets. Users of this chat experience the following behavior (assume we made some API calls beforehand to prevent a cold start):

  1. User sends a message - message is delivered in 100 ms
  2. User sends a message - message is delivered in 150 ms
  3. User sends a message - message is delivered in 6 sec
  4. User sends a message - message is delivered in 4 sec
  5. User sends a message - message is delivered in 90 ms
  6. User sends a message - message is delivered in 5 sec

And so on. Even after dozens of HTTP requests, users still experience these delays.

The long delivery time in some cases (several seconds) is caused by the orchestration not starting immediately. Such delays look unacceptable for this use case.

From what you are saying, I understand that these random delays are caused by backoff polling of the queues in the Durable Task hub, which avoids huge storage account charges, and that I can avoid the polling by using only one instance. But in my case I definitely need more than one instance (to make the solution scalable) as well as a quick orchestration start. Maybe there are some configurable settings for this polling period?

6 reactions
myarotskaya commented, Feb 22, 2019

Hi @cgillum,

I have been working on this case with Azure Support engineers. They confirmed that these ~6 second delays happen quite frequently and that they are not caused by a cold start. The resolution of this ticket is:

As I checked with our internal team the Durable Function is mostly useful to define stateful workflows. It can manages state, checkpoints, and restarts for you. And it may be more suitable for the system which not require real time update very sensitive.

So I wonder: are Durable Functions really supposed to be used only in systems that are OK with delays of several seconds? If that’s true, then it should be reflected in the documentation somewhere (or maybe it already is?), since it could save quite a lot of time for developers like me who don’t expect that such delays are even possible.

And another question: is there any chance that we can decrease the polling period to prevent these delays (I understand that it increases storage account charges, but still)? I just don’t get why regular Azure Functions trigger on queue storage messages almost immediately but Durable Functions do not.
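
For reference on the polling-period question: later releases of the Durable Functions extension document a maxQueuePollingInterval setting in host.json that caps how far the control-queue and work-item-queue polling can back off (the documented default is 30 seconds). Whether this setting is available in extension version 1.7.1, the version used in this issue, is an assumption that should be checked against the release notes. The fragment below is a minimal sketch in the Functions 2.0 / extension 1.x host.json layout; in extension 2.x the same setting is documented under a nested storageProvider section. The 1-second value is illustrative only.

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxQueuePollingInterval": "00:00:01"
    }
  }
}
```

Lower values shorten the worst-case wait before an idle worker notices a new message, at the cost of more storage transactions, which is the trade-off described in the comments above.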

Read more comments on GitHub >

Top Results From Across the Web

Delay time between Activity increasing when using Azure ...
Under certain conditions, you may observe multi-second delays between when an orchestration is scheduled to run and when it starts running.

Durable Functions Troubleshooting Guide - Azure
Orchestration starts after a long delay. Normally, orchestrations start within a few seconds after they're scheduled. However, there are certain ...

Timers in Durable Functions - Azure
Durable Functions provides durable timers for use in orchestrator functions to implement delays or to set up timeouts on async actions.

Azure Durable Functions
Starter Function : Simple Azure Function that starts the Orchestration by calling the Orchestrator function. It uses an OrchestrationClient binding.

Using Azure Function App with in a near real time system
One quirky observation we noticed was randomly there would be a 30s wait between a trigger being fired and when the orchestration would...
