[QUERY] Distributed tracing Activity scope for batched Service Bus Functions
Library name and version
Azure.Messaging.ServiceBus 7.13.1
Microsoft.Azure.WebJobs.Extensions.ServiceBus 5.9.0
Query/Question
There are currently discrepancies in Activity scoping, and in how the Functions treat the incoming message Diagnostic-Id properties, depending on whether Service Bus messages are processed in batches or one at a time.
In my scenario, one or more message producers send Service Bus messages to a given queue/topic, and each of these messages may carry a different Diagnostic-Id. The expectation is that this producer-provided trace ID is used within the Function when processing the message, and is in turn propagated to any downstream components.
Now, when using a singular message Function (as below), this works perfectly.
public void ProcessIncomingMessage([ServiceBusTrigger(...)] ServiceBusReceivedMessage message)
{
    // Activity "ServiceBusProcessor.ProcessMessage"
    // uses incoming `Diagnostic-Id` ✅
    DoWork(message);
}
When using a batched message Function, however, the incoming Diagnostic-Ids are somewhat “squashed” at this level: the Activity uses a new ID (which is sent downstream), and the individual message trace IDs are added to Activity.Links, as below:
public void ProcessIncomingMessages([ServiceBusTrigger(...)] ServiceBusReceivedMessage[] messages)
{
    // Activity "ServiceBusListener.ProcessMessages"
    // uses new Id; incoming `Diagnostic-Id`s added to `Activity.Links` 🤔
    foreach (var message in messages)
    {
        DoWork(message);
    }
}
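To see what the batch Activity actually carries, the links can be read back inside the Function. The following is only a minimal sketch: `BatchLinkInspector` and `LinkedTraceIds` are hypothetical names introduced for illustration, and it assumes the batch Activity is available as `Activity.Current` when the Function body runs.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class BatchLinkInspector
{
    // Hypothetical helper (not part of any SDK): returns the trace IDs
    // referenced by the links of the given Activity, or an empty list
    // when there is no Activity.
    public static IReadOnlyList<string> LinkedTraceIds(Activity? activity) =>
        activity is null
            ? new List<string>()
            : activity.Links.Select(link => link.Context.TraceId.ToString()).ToList();
}
```

Calling `BatchLinkInspector.LinkedTraceIds(Activity.Current)` at the top of the batch Function and logging the result should show one entry per message that carried a parsable Diagnostic-Id, if the behavior is as described above.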
This seems backwards. I would expect the incoming Diagnostic-Ids to be used as the “primary” Activity ID, with the “batch” Activity added as a Link (or equivalent).
Whether the message consumer processes messages in batches or singly could be considered an implementation detail (at least in this scenario), and the message producer shouldn’t care; it would expect the Diagnostic-Id it provided in a given message to be the trace ID used for any downstream actions performed while processing it.
Now, by this point it may seem obvious that such scoping is just not really feasible within a method written like this (with a ServiceBusReceivedMessage[] argument), and that the workaround/solution is simply to perform this context flip manually within the batch processor:
foreach (var message in messages)
{
    // disposing the Activity at the end of each iteration stops it
    using var activity = new Activity("ServiceBusProcessor.ProcessMessage");
    // preserve reference to the batch Activity (optional);
    // Activity.Current may be null if no batch Activity is active
    activity.AddBaggage("BatchActivityId", Activity.Current?.Id);
    // use incoming `Diagnostic-Id` (if present)
    if (message.ApplicationProperties.TryGetValue("Diagnostic-Id", out var value)
        && value is string diagnosticId
        && ActivityContext.TryParse(diagnosticId, null, out var parentContext))
    {
        activity.SetParentId(parentContext.TraceId, parentContext.SpanId, parentContext.TraceFlags);
    }
    activity.Start();
    DoWork(message);
}
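A variant of the same idea, sketched with ActivitySource so that the incoming context becomes the parent and the batch Activity is attached as a link rather than as baggage. This is an assumption-laden sketch, not the SDK's recommended pattern: `PerMessageTracing`, `StartMessageActivity`, the source name "MyApp.ServiceBusBatch" and the activity name "ProcessMessage" are all hypothetical, and it assumes the Diagnostic-Id value is in W3C traceparent format and that your telemetry pipeline listens to this source.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

public static class PerMessageTracing
{
    // Hypothetical source name; a listener must subscribe to it,
    // otherwise StartActivity returns null.
    private static readonly ActivitySource Source = new("MyApp.ServiceBusBatch");

    // Starts a per-message Activity parented to the producer's context
    // (when a parsable Diagnostic-Id is present) and linked to the
    // ambient batch Activity (when there is one).
    public static Activity? StartMessageActivity(
        IReadOnlyDictionary<string, object> applicationProperties)
    {
        ActivityContext parent = default;
        if (applicationProperties.TryGetValue("Diagnostic-Id", out var value)
            && value is string diagnosticId)
        {
            // Leaves parent as default if the value cannot be parsed.
            ActivityContext.TryParse(diagnosticId, null, out parent);
        }

        var batch = Activity.Current;
        var links = batch is null ? null : new[] { new ActivityLink(batch.Context) };

        return Source.StartActivity(
            "ProcessMessage", ActivityKind.Consumer, parent, tags: null, links: links);
    }
}
```

Inside the batch loop this would be used as `using var activity = PerMessageTracing.StartMessageActivity(message.ApplicationProperties);` followed by `DoWork(message);`. Because the result can be null when nothing is listening, any later access to `activity` should be null-checked.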
I guess my main question at this point is whether or not this would be the suggested approach; or whether there’s another way of implementing a batch-triggered Function such that the trace IDs are registered as I would expect them to be (per-message). The documentation I could find on this subject is rather limited.
Environment
.NET SDK 6.0.407
Issue Analytics
- Created 5 months ago
- Comments:19 (8 by maintainers)
Top GitHub Comments
I should clarify that the behavior to use links with batches applies to both ActivitySource (which is still experimental) and the GA DiagnosticSource support. However, it was influenced by the OpenTelemetry spec. I will add a section to the guide that discusses the different behavior between batches and single messages.
Hi @jacobjmarks. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text “/unresolve” to remove the “issue-addressed” label and continue the conversation.