Orchestration getting stuck while getting the lock

Description

The full description, with code snippets, screenshots, and issue samples, is here: https://github.com/Azure/azure-functions-durable-extension/discussions/2530

Expected behavior

Acquire the lock in seconds at most, not minutes or hours.

Actual behavior

The orchestration appears to get stuck while acquiring the locks on the entities involved in the orchestration.

Known workarounds

Reset the durable storage account and the function app storage account.

App Details

  • Durable Functions extension version (e.g. v1.8.3): 2.10.0
  • Azure Functions runtime version (1.0 or 2.0): 2
  • Programming language used: C#

If deployed to Azure

  • Timeframe issue observed:
  • Function App name: orders-saga
  • Azure region: West-US
  • Azure storage account name: ordersagav2

Issue Analytics

  • State: open
  • Created a month ago
  • Comments: 32 (14 by maintainers)

Top GitHub Comments

1 reaction
nytian commented, Aug 21, 2023

Hi, sorry for the late response. We are still working to identify the root cause.

First, for the Azure Storage backend with the old partition manager: for the orchestration instance ID you provided, the issue appears to be that the partition for control queue 01 couldn’t be handed over to another worker for several hours. @davidmrdavid is working on a private package to mitigate this temporarily, and we hope to give it to you tomorrow.

Since the issue occurs with both versions of the partition manager, we don’t know whether the root cause above also applies to the new partition manager. Could you provide an orchestration instance ID, TaskHub, and timestamp that hit this issue with partition manager V3? It is fine even if the storage account has been deleted, since we keep the Kusto logs in a separate storage account. That would help me identify the cause in the partition manager V3 scenario.

1 reaction
vany0114 commented, Aug 8, 2023

It’s not intermittent; it was happening with basically every request processed by the orchestrator. Here are two more examples where instances were stuck for hours.

  • {"id":"3107849517381639","businessDate":"2023-08-07T00:00:00","locationToken":"5zAQ1KZzYkqmea9XbBkLbA=="} This one was stuck for more than 10 hours.

  • {"id":"3107849515304964","businessDate":"2023-08-07T00:00:00","locationToken":"5zAQ1KZzYkqmea9XbBkLbA=="} This one was stuck for more than 6 hours.

As a result, our processed orders rate was affected; it dropped dramatically because the orchestrator was holding almost all of them.

@davidmrdavid, please let me know if the information provided helps.
