QueueTrigger stops triggering - how to troubleshoot?
See original GitHub issueHere is the scenario.
We have one WebJob with 2 functions using QueueTrigger
- first function is responsible for generating mail content (polling mail-generator queue)
- second function is sending the generated email (polling mail queue)
The first function outputs a message to the mail queue using as shown below.
public async Task ProcessMailGeneratorQueue([QueueTrigger("%AzureSettings:MailGeneratorQueueName%")] string message, ILogger logger, ExecutionContext context, [Queue("%AzureSettings:MailQueueName%")] CloudQueue outputQueue)
{
MailNotificationModel model = JsonConvert.DeserializeObject<MailNotificationModel>(message, new MailNotificationModelConverter());
logger.LogInformation($"Generating {model.ReferenceKey}");
var email = await _mailGenerator.PrepareMailMessage(model);
CloudQueueMessage queueMessage = new CloudQueueMessage(JsonConvert.SerializeObject(email));
await outputQueue.AddMessageAsync(queueMessage);
}
This has been working well for several months. However in the past ~5 weeks ago, we have had several occurrences when the 2nd function is not triggering anymore while the queue contains many messages.
If we restart the web job from the portal (i.e. Stop, Wait for it to stop and Start), the function works properly again.
At this stage, we are not sure how we can further troubleshoot the problem as we have no error traces (we are using AppInsights). I am therefore looking for some advice as how we can further troubleshoot this issue. Is there something we should look after? or something we should add to our logging?
Any help or idea is welcome as this is really annoying and makes the whole solution useless if we cannot rely on it.
Related information
Our volume for this WebJob remain quite “small”. We are not talking about thousands of messages, nor even hundreds. At peak time, we could have couple of dozens of messages per minute.
We are using the following package versions
<PackageReference Include="Microsoft.Azure.WebJobs" Version="3.0.3" /> <PackageReference Include="Microsoft.Azure.WebJobs.Extensions" Version="3.0.1" /> <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.Storage" Version="3.0.2" /> <PackageReference Include="Microsoft.Azure.WebJobs.Logging.ApplicationInsights" Version="3.0.3" /> <PackageReference Include="Microsoft.Extensions.Caching.Memory" Version="2.2.0" /> <PackageReference Include="Microsoft.Extensions.Configuration" Version="2.2.0" /> <PackageReference Include="Microsoft.Extensions.Configuration.CommandLine" Version="2.2.0" /> <PackageReference Include="Microsoft.Extensions.DependencyInjection" Version="2.2.0" /> <PackageReference Include="Microsoft.Extensions.Logging.Console" Version="2.2.0" /> <PackageReference Include="WindowsAzure.Storage" Version="9.3.3" />
Work around
At this point, only workaround is to stop/start the WebJob in order for it to dequeue again and we have created a function that retrieves the number of messages in queue and alert us when the number of messages reaches a certain threshold (25 in the last 5 minutes)
Issue Analytics
- State:
- Created 5 years ago
- Comments:6 (3 by maintainers)
Top GitHub Comments
Note that we’ve also just added more logs to queue triggers (#2277) – when this is released to nuget (which should be fairly soon), please give it a try to see if the logs it generates are helpful. You’ll need to enable Debug-level logs to get them to show up, but the logs will write every time we poll the queue and every time we delay a poll.
As for what storage logs are useful – you can enable diagnostic logs to be written to the
$logs
blob container on the same storage account. This will tell us everytime we attempted to read/write/delete a queue message. You can find those settings here (note the “Logging” section):I’m going to close this as I believe #2277 will help diagnose this problem. If not, or if you get more details, please feel free to re-open and we can take another look.