[BUG] ServiceBusTrigger - lock renewals stop suddenly/randomly

Library name and version

Microsoft.Azure.WebJobs.Extensions.ServiceBus 5.11.0

Describe the bug

I have a WebJobs project with continuous jobs and a Service Bus trigger (topic). The function can run for a long time (up to 40 minutes), and at some point the calls that renew the message lock (PeekLock mode) stop being made. The function then fails when it attempts to auto-complete the message. MaxAutoLockRenewalDuration is not exceeded, as the attached project shows.

This is especially hard to deal with because all the work carried out by my function completes but the message goes back to the queue and is re-processed in a retries loop. You can see the output from AppInsights (in Rider) in the screenshot below.

[Screenshot: AppInsights output viewed in Rider]

Expected behavior

The message lock should be renewed. Auto-complete should successfully complete the message.

Actual behavior

The message lock is not renewed. Auto-complete fails to complete the message.

Reproduction Steps

You’ll need to set up Service Bus in appsettings.json and create the required messaging entities (check TestTrigger.cs). It takes a while - I’ve seen this 5-6 times in the last 2 days when running this project locally.

Attachment: ServiceBusLockLostRepro.zip

Environment

Hosting: Azure AppService (but the same can be reproduced locally, running against the environment below)

.NET SDK:
 Version:   7.0.203
 Commit:    5b005c19f5

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.19044
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\7.0.203\

Host:
  Version:      7.0.5
  Architecture: x64
  Commit:       8042d61b17

The issue is IDE-agnostic. I ran it from within Rider as well as using CLI.

Issue Analytics

  • State: closed
  • Created 3 months ago
  • Comments: 20 (10 by maintainers)

Top GitHub Comments

1 reaction
jsquire commented, Jul 6, 2023

Hi @pzaj2. To be clear - if the root cause turns out to be a failure to renew locks, then it is absolutely a client bug that we should fix.

However, if it is the result of intermittent network issues causing the connection or link to drop, there’s nothing that your application or the client can do to directly prevent it, unfortunately. It is something that would need to be mitigated by ensuring that the application’s processing is idempotent and can ignore duplicate data.

Thus far, we haven’t been able to repro and are not seeing stress test failures for this scenario. Logs are going to be our best bet, assuming that you’re able to repro.

The long-term solution is for Service Bus to support AMQP’s durable terminus, which allows for link state to be persistent across connections. Once the service has support, we’ll add it to the client which would mitigate the “I lost my connection and now my lock is invalid” scenario. I do not have insight into the timing for the service feature, however, only that it is on the roadmap.
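The idempotency mitigation mentioned in the comment above could look roughly like the sketch below. It is written in plain Python rather than against the Service Bus SDK, and every name in it (`Message`, `IdempotentHandler`, `processed_ids`) is hypothetical. It assumes each message carries a stable ID and that a record of completed IDs is kept; the in-memory set here only stands in for a durable store shared by all instances.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class Message:
    message_id: str  # hypothetical stand-in for a stable, per-message ID
    body: str


@dataclass
class IdempotentHandler:
    """Runs `work` at most once per message_id, even if the message is redelivered."""

    work: Callable[[Message], None]
    # Assumption: in-memory only. A real job would need a durable, shared store.
    processed_ids: Set[str] = field(default_factory=set)

    def handle(self, message: Message) -> bool:
        """Return True if the work ran, False if this delivery was a duplicate."""
        if message.message_id in self.processed_ids:
            return False
        self.work(message)
        # Record the ID only after the work succeeds, so failed attempts are retried.
        self.processed_ids.add(message.message_id)
        return True


results = []
handler = IdempotentHandler(work=lambda m: results.append(m.body))
msg = Message(message_id="order-42", body="ship order 42")

first = handler.handle(msg)   # first delivery: the work runs
second = handler.handle(msg)  # redelivery after a lost lock: skipped
```

With a check like this, a message that returns to the queue because its lock was lost is acknowledged on redelivery without repeating the work. It does not prevent the lock loss itself.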

0 reactions
JoshLove-msft commented, Aug 3, 2023

Closing this out as I haven’t been able to repro and we have made some improvements to make lock lost issues less likely to occur.
