Lease token was taken over by owner exception
See original GitHub issue
Some of our change feed processors are stuck, and we're seeing some OperationCanceledExceptions as well as these types of CosmosExceptions in the logs:
Response status code does not indicate success: PreconditionFailed (412); Substatus: 0; ActivityId: ; Reason: (796 lease token was taken over by owner something-6c082e98-54b1-4fe9-9486-fc51ce2be403
What does it mean for a lease token to be taken by another owner? I was under the impression that a single lease is owned by a single compute instance. The change feed processors also seem to start running again from time to time and then halt again seemingly randomly.
Our configuration involves:
- A single monitored container
- Multiple change feed processors doing different things
- All of them use the same lease configuration
- Multiple instances of the processors on multiple hosts where every instance name is postfixed by a guid so it has a unique name
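The 412 (PreconditionFailed) in the log above comes from the lease container's optimistic concurrency: each lease is a document, and updates are conditional on its ETag, so a 412 means another instance modified (took over) the lease since this instance last read it. Below is a minimal, hypothetical Python model of that mechanism; the names `LeaseStore` and `try_update` are made up for illustration and the real SDK is C#/.NET operating on lease documents.

```python
# Hypothetical model of lease ownership with ETag-style optimistic concurrency.
# A 412 in Cosmos DB corresponds to PreconditionFailed below: the caller's
# snapshot of the lease is stale because another owner updated it first.

class PreconditionFailed(Exception):
    """Stands in for CosmosException with status 412."""


class LeaseStore:
    def __init__(self):
        # lease_token -> (owner, etag); etag 0 means "never written"
        self._leases = {}

    def try_update(self, token, new_owner, expected_etag):
        owner, etag = self._leases.get(token, (None, 0))
        if etag != expected_etag:
            # Someone else wrote the lease after our read: takeover detected.
            raise PreconditionFailed(
                f"{token} lease token was taken over by owner {owner}"
            )
        new_etag = etag + 1
        self._leases[token] = (new_owner, new_etag)
        return new_etag
```

For example, if host A acquires lease "796" and host B then takes it over, host A's next renewal (still holding the old ETag) fails with the "taken over by owner" error, which is expected load-balancing behavior rather than a fault by itself.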
Issue Analytics
- Created 6 months ago
- Comments: 18 (11 by maintainers)
Top Results From Across the Web
Azure cosmos changefeed Processor options
1 Answer. Leases when not renewed are not removed by the current instance. Other instances can "think" that the lease was not renewed...
Read more >
Change feed processor in Azure Cosmos DB
The lease container: The lease container acts as state storage and coordinates processing the change feed across multiple workers. The lease...
Read more >
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Here’s what I found. The issue is indeed on our side and it occurred when I was migrating to Cosmos DB v3. I’ll describe it here for future reference.
We have manual checkpointing logic and a lot of other things built on top of the change feed processors. In our v2 code, this is how things would play out during an unhandled exception: the exception is stored in a private field; when the ProcessChangesAsync method gets called again and this field is set, it logs that it's in a faulted state and throws it again. When an exception occurred in our initial v3 code, that same faulted-state logic ran. Now any instance that had an exception would halt and perpetually retry acquiring a lease and attempting to process. Ergo: infinite "Lease was taken over by owner" logs.
So, for us, what used to be instances of IChangeFeedObserver that got dumped every time they had an unhandled exception (v2) were now delegates that continued to be reused in a perpetual faulted state (v3). I just removed the parts about keeping the exception in a private field.
Thank you for all the help! I wouldn’t have been able to find my mistake without it.
You can close this issue.
@ealsur We’re investigating this, but for the time being I don’t believe it to be related to Cosmos DB v3. We have a bunch of things built on top of the change feed processor SDK, and some of them, like the batching, are probably the crux of it. As soon as I get a better understanding of what’s happening I’ll close this issue 😃