[Singleton Pattern] Instances are going to Pending and some remain in Running State.
I need to process messages with the same PersonId one at a time (singleton fashion), while messages with different PersonIds should run in parallel. For this purpose, I maintain an ordered list of messages per PersonId in Redis Cache, e.g. mycache:PersonId:1 contains all messages for PersonId 1. If an instance with the same PersonId is already running, the Azure queue message is ignored.
Testing went fine locally using Azure Functions Core Tools v2. When deployed to Azure, some instances go to the Pending state and some remain in Running forever. Sometimes all instances go to the Pending state.
Why is this happening on Azure?
Given below is a similar sample of the code structure:
[FunctionName("PersonFunction_QueueStart")]
public static async Task QueueStart(
    [QueueTrigger("person-queue", Connection = "connection-string")] PersonMessage personMessage,
    [OrchestrationClient] DurableOrchestrationClient starter,
    ILogger log)
{
    // log.LogInformation(..); log some attributes
    string instanceId = personMessage.PersonId;
    log.LogInformation($"Checking if instance with instance ID {instanceId} already exists.");
    var instance = await starter.GetStatusAsync(instanceId);
    if (instance == null ||
        instance.RuntimeStatus == OrchestrationRuntimeStatus.Completed ||
        instance.RuntimeStatus == OrchestrationRuntimeStatus.Failed)
    {
        log.LogInformation($"PersonFunction instance with instance ID {instanceId} does not exist.");
        await starter.StartNewAsync("PersonFunction", instanceId, personMessage);
        log.LogInformation($"Started orchestration with ID = {instanceId}.");
    }
    else
    {
        log.LogInformation($"PersonFunction instance with Instance ID {instanceId} already exists.");
    }
}
[FunctionName("PersonFunction")]
public static async Task RunOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext context, ILogger log)
{
    log.LogInformation($"Executing PersonFunction orchestration with instance id {context.InstanceId}.");
    var input = context.GetInput<PersonMessage>();
    // Retrieve message from cache
    var personMessage = await context.CallActivityAsync<PersonMessage>(
        "PersonFunction_RetrieveMessage", input.PersonId);
    if (personMessage == null)
    {
        // Person_Monitor inserts a message in queue of Monitor durable function
        // which reads the cache after some seconds to see if any message is left.
        // If any message in cache is left it inserts a message in person-queue.
        // No return value is needed, so the non-generic overload is used.
        await context.CallActivityAsync("Person_Monitor", input);
    }
    else
    {
        // Execute some stored procedure
        var result = await context.CallActivityAsync<PersonMessageResult>(
            "PersonFunction_ExecuteProcedure", personMessage);
        // Left pop message from Redis
        await context.CallActivityAsync<PersonMessage>(
            "PersonFunction_LeftPopMessage", input.PersonId);
        context.ContinueAsNew(input);
    }
    log.LogInformation($"PersonFunction orchestration with instance id {context.InstanceId} executed successfully.");
}
NuGet packages:
- Microsoft.Azure.WebJobs.Extensions.Storage 3.0.3
- Microsoft.NET.Sdk.Functions 1.0.24
- Microsoft.NETCore.App 2.2.0
- Newtonsoft.Json 11.0.2
- StackExchange.Redis 2.0.519
- System.Data.SqlClient 4.6.0

Thanks @tehmas. I found your orchestration and confirm that it has gotten into a bad state. I see that you’re using the ContinueAsNew pattern, but I’m also detecting oddities. For example, it looks like multiple singletons are being created with the same name and around the same time. One such time is 2019-02-21 07:05:14.8686352 - three instances of your singleton were created concurrently.

It seems to me that you’re running into this issue: https://github.com/Azure/azure-functions-durable-extension/issues/612

Looking at your PersonFunction_QueueStart method, there is a race condition where two queue messages processed at the same time could cause two singletons to be created at the same time. This appears to be the source of the corruption. To fix it, you’ll need to use some form of locking (for example, a blob lease or the [Singleton] WebJobs attribute) to prevent this from happening. In the meantime, we’re looking into ways to make the StartNewAsync API safe for multi-threaded use.

Unfortunately I don’t have any insights or expertise on Redis, so I can’t comment on how that might be impacting your function app.
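As a rough illustration of the [Singleton] suggestion above, the queue-triggered starter could be scoped by PersonId so that the check-then-start sequence is not interleaved for the same person. This is only a sketch, not a confirmed fix: the PersonMessage class shown here is a hypothetical minimal shape (only the PersonId property appears in the original sample), and how the attribute behaves on a given hosting plan is an assumption that should be tested.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

// Hypothetical minimal message shape; only PersonId is known from the original sample.
public class PersonMessage
{
    public string PersonId { get; set; }
}

public static class PersonFunctionStarter
{
    // The scope expression "{PersonId}" is resolved from the queue message, so the
    // WebJobs host serializes invocations that share a PersonId while invocations
    // for different PersonIds can still run in parallel.
    [Singleton("{PersonId}")]
    [FunctionName("PersonFunction_QueueStart")]
    public static async Task QueueStart(
        [QueueTrigger("person-queue", Connection = "connection-string")] PersonMessage personMessage,
        [OrchestrationClient] DurableOrchestrationClient starter,
        ILogger log)
    {
        string instanceId = personMessage.PersonId;

        // With the lock held, the status check and the start below are no longer
        // interleaved with another invocation for the same PersonId.
        var instance = await starter.GetStatusAsync(instanceId);
        if (instance == null ||
            instance.RuntimeStatus == OrchestrationRuntimeStatus.Completed ||
            instance.RuntimeStatus == OrchestrationRuntimeStatus.Failed)
        {
            await starter.StartNewAsync("PersonFunction", instanceId, personMessage);
            log.LogInformation($"Started orchestration with ID = {instanceId}.");
        }
        else
        {
            log.LogInformation($"Instance {instanceId} already exists; message ignored.");
        }
    }
}
```

The lock only covers the body of the starter function. It does not change how the orchestration itself runs, so the ContinueAsNew loop is unaffected.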
I meant that starting at 14:17 I see a large gap in activity, followed by new activity in a new (recycled) process. I broke it down further so you can see in more detail:
My takeaway here is that your activity function somehow got into a hung state and therefore exceeded the 5-minute timeout. I can’t explain why it hung because that seems to be somewhere in your application logic. I think the next step for you would be to figure out why your code is occasionally hanging. It could be a Redis issue, or (more likely IMO) it could be a deadlock somewhere in your code or in the SDK you’re using. In either case, you may need to get a Redis specialist involved.
Based on this analysis, I don’t think there’s actually an issue with the Durable Functions extension, so I’ll go ahead and close this issue. Do let me know if you find something that makes you think otherwise and we can re-investigate.
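One possible way to narrow down the hang described above is to put an explicit time limit around each external call inside the activity functions, so that a stuck call surfaces as a logged exception instead of silently running into the function timeout. The helper below is a hypothetical sketch using only the .NET base class library; the name TimeoutGuard and the label parameter are illustrative and not part of any SDK.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutGuard
{
    // Awaits `work`, but throws TimeoutException if it has not finished within `limit`.
    // The underlying operation is not cancelled; it is only abandoned by the caller.
    public static async Task<T> WithTimeout<T>(Task<T> work, TimeSpan limit, string label)
    {
        Task delay = Task.Delay(limit);
        Task finished = await Task.WhenAny(work, delay).ConfigureAwait(false);
        if (finished != work)
        {
            throw new TimeoutException($"{label} did not complete within {limit}.");
        }

        // The task has already completed; awaiting it returns the result
        // or rethrows the original exception.
        return await work.ConfigureAwait(false);
    }
}
```

Inside an activity, a Redis read or a stored-procedure call that returns a Task<T> could be wrapped as `await TimeoutGuard.WithTimeout(someCallAsync(), TimeSpan.FromSeconds(30), "Redis read")`, where someCallAsync stands in for the real call and the 30-second limit is an arbitrary example. The resulting exception and its label then point at the call that stalled.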