[QUERY] concurrency in batching
Library name and version
Azure.Messaging.ServiceBus 7.13.1
Query/Question
The following methods seem to be extremely slow:
- (await sender.CreateMessageBatchAsync(cancellationToken)).TryAddMessage(...)
- SendMessagesAsync
In a scenario like the one below - using recursion to build a list of safe batches to send in parallel.
public async Task SendBatchesAsync(
    string topicOrQueueName,
    List<MessageBatch> messages,
    CancellationToken cancellationToken = default)
{
    ...
    // illustrative purpose only - client and sender are DI'ed from container
    var clientOptions = new ServiceBusClientOptions
    {
        TransportType = ServiceBusTransportType.AmqpWebSockets
    };

    var client = new ServiceBusClient(
        "xxxxxx.servicebus.windows.net",
        new DefaultAzureCredential(),
        clientOptions);

    var sender = client.CreateSender(topicOrQueueName);

    var result = await CreateBatchesSender(
        sender,
        messages,
        new List<ServiceBusMessageBatchWrapper>(),
        new ServiceBusBatchResult() { Failure = new ServiceBusSendBatchException("batch insert failures") },
        indexPointer: 0,
        cancellationToken);

    // Calling DisposeAsync on client types is required to ensure that network
    // resources and other unmanaged objects are properly cleaned up.
    await sender.DisposeAsync();
    await client.DisposeAsync();

    if (!result.AllSucceeded)
    {
        throw result.Failure;
    }
}
private async Task<ServiceBusBatchResult> CreateBatchesSender(
    ServiceBusSender sender,
    List<MessageBatch> messages,
    List<ServiceBusMessageBatchWrapper> batches,
    ServiceBusBatchResult result,
    int indexPointer,
    CancellationToken cancellationToken)
{
    var currentList = messages.Skip(indexPointer).ToList();
    if (currentList.Count == 0)
    {
        return result;
    }

    // Start sending the previous batch whilst the new one is building
    Task execSend = batches?.Count > 0
        ? sender.SendMessagesAsync(batches?.Last().Batch, cancellationToken)
        : Task.FromResult(true);

    int index = indexPointer;
    _logger.LogDebug("creating batcher...");

    var customBatch = new ServiceBusMessageBatchWrapper
    {
        Batch = await sender.CreateMessageBatchAsync(cancellationToken),
    };

    _logger.LogDebug("finished creating batch sender, starting to process...");

    for (int b = 0; b < currentList.Count; b++)
    {
        if (!customBatch.Batch.TryAddMessage(BuildServiceBusMessage(currentList[b].Payload, currentList[b].Context)))
        {
            // do not increment index
            _logger.BatchSizeExceeded(index);
            break;
        }

        customBatch.AddIndex(index);
        index++;
    }

    batches.Add(customBatch);

    return await CreateBatchesSender(
        sender,
        messages,
        batches,
        await CaptureException(execSend, batches?.Last()?.Indexes, result),
        index,
        cancellationToken);
}
internal static ServiceBusMessage BuildServiceBusMessage(
    string payload,
    IMessageContext? messageContext)
{
    var serviceBusMessage = new ServiceBusMessage(new BinaryData(payload))
    {
        ContentType = ApplicationJson,
        CorrelationId = messageContext?.CorrelationId ?? "",
        Subject = messageContext?.Label,
        ScheduledEnqueueTime = messageContext?.ScheduledEnqueueTimeUtc ?? DateTimeOffset.UtcNow
    };

    serviceBusMessage.AddApplicationProperties(messageContext?.CustomProperties);
    return serviceBusMessage;
}
/// <summary>
/// Executes and captures any exception from the background operation
/// </summary>
/// <param name="task">The in-flight send operation to await.</param>
/// <param name="indexes">The message indexes covered by the batch being sent.</param>
/// <param name="result">The accumulated result that any failure is recorded against.</param>
/// <returns>The updated <paramref name="result"/>.</returns>
internal static async Task<ServiceBusBatchResult> CaptureException(Task task, List<int> indexes, ServiceBusBatchResult result)
{
    try
    {
        await task;
    }
    catch (Exception ex)
    {
        result.Failure.Exceptions.Add((indexes, ex));
        result.AllSucceeded = false;
    }

    return result;
}
From looking at the App Insights dependency analysis, it seems TryAddMessage is not especially fast, and, more importantly, the await sender.CreateMessageBatchAsync(cancellationToken) call takes anywhere between 1 and 3 seconds.
If you are looping through 18k items that are split into 62 batches, that call obviously runs 62 times, which on its own would contribute ~2 minutes of run time.
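For illustration, here is a minimal sketch (not part of the production code) of how the batch creation could be timed in isolation against the same DI'ed sender, to confirm where the latency sits; the iteration count is arbitrary:

var stopwatch = new System.Diagnostics.Stopwatch();

for (int i = 0; i < 5; i++)
{
    stopwatch.Restart();

    // Only the batch creation is measured here; no messages are added or sent.
    using ServiceBusMessageBatch batch = await sender.CreateMessageBatchAsync(cancellationToken);

    stopwatch.Stop();
    _logger.LogDebug("CreateMessageBatchAsync #{Iteration} took {ElapsedMs} ms", i, stopwatch.ElapsedMilliseconds);
}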
The sender is created above and is scoped to a specific queue; my understanding is that at this point the AMQP connection has been established and is simply re-used - in this case inside the recursive call.
For reference, creating a List<T> of 18k items, including serializing the “payload” into a string, takes under 1 second, as expected.
The same series of operations in the Go SDK takes about ~5 seconds for a 50k file, which also includes downloading that file from Blob storage.
Any thoughts/pointers would be welcomed.
PS: some implementations involved returning a list of safe batches and then Task.WhenAll(sender.SendMessagesAsync(...)), but this was timing out after a minute…
PS2: AWS/GCP both include an errorList in their response for batched operations - maybe something like this could be added here.
Environment
.NET SDK:
Version: 7.0.201
Commit: 68f2d7e7a3
Runtime Environment:
OS Name: Mac OS X
OS Version: 12.6
OS Platform: Darwin
RID: osx.12-x64
Base Path: /usr/local/share/dotnet/sdk/7.0.201/
Host:
Version: 7.0.3
Architecture: x64
Commit: 0a2bda10e8
VS for Mac 17.5.1 (build 23)
Top GitHub Comments
Hi @dnitsch. Thank you for the additional context. I think that I now understand where some issues may be, though we’d need to capture SDK logs for a 5-minute time slice around the issue to be sure.
When you're calling SendMessagesAsync on a single sender, you're attempting to transmit each batch over the same AMQP link. Though you can make the calls concurrently, each operation on a link is queued, allowing only one outstanding send, ending when the service acknowledges receipt. Each operation uses the TryTimeout configured on the client to govern how long it is allowed to remain in an active state. By default, this is 60 seconds; unless all of your sends complete within that time, they will fail with a timeout and Task.WhenAll will see a faulted task, causing it to throw.

With 16 virtual cores, the host machine can perform only 16 operations concurrently. Starting 62 concurrent tasks means that you'll potentially see continuations for async operations getting queued and waiting to resume. Since there is no fairness in scheduling, even when the system is lively and continuing to make forward progress, some tasks may end up running longer than their timeout while waiting to be scheduled. Scenarios that trigger retries, such as throttling and transient failures, may exacerbate this.
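For reference, the TryTimeout mentioned above can be adjusted through the client's retry options; a minimal sketch, reusing the namespace placeholder from the question (the 2-minute value is only illustrative):

var clientOptions = new ServiceBusClientOptions
{
    TransportType = ServiceBusTransportType.AmqpWebSockets,
    RetryOptions = new ServiceBusRetryOptions
    {
        // Governs how long each individual send attempt may remain active.
        TryTimeout = TimeSpan.FromMinutes(2)
    }
};

var client = new ServiceBusClient(
    "xxxxxx.servicebus.windows.net",
    new DefaultAzureCredential(),
    clientOptions);

Raising the timeout only papers over queuing on a single link, though - the recommendations that follow address the underlying contention.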
Yes, each SendAsync call - regardless of whether ServiceBusMessageBatch is used or not - is atomic. All messages will either succeed or fail as one unit. However, it is important to note that this is NOT true of the Task.WhenAll that you're using. Each of those SendAsync tasks is independent and will succeed/fail atomically, but you have the potential for partial success across tasks.

Recommendations
- Take a peek at Best Practices for performance improvements using Service Bus Messaging, which discusses some high-level considerations around Azure resources.
- Consider the amount of concurrency that you need against the available resources in the host. It varies greatly by the application, host environment, and workload. The current 4:1 ratio of tasks to virtual cores may or may not be ideal for achieving the throughput that you're looking for. We generally recommend starting with a 2:1 ratio and testing under real-world conditions to find the right balance.
- If you're intending to perform concurrent sends, then you'll want to create the same number of senders as the degree of concurrency that you'd like. Each will create a dedicated AMQP link, allowing them to transmit concurrently - within the limits of the network and service. Depending on how many degrees of concurrency you select, you may want to create additional ServiceBusClient instances to spread sends out across connections for better throughput. Since throughput will vary depending on a number of factors, it is recommended that you test with your application to discover the best balance. (A rough sketch of this approach follows at the end of this comment.)

I'm going to mark this as addressed, but please feel free to unresolve if you'd like to continue the discussion.
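A rough sketch of that multiple-senders approach, assuming the client, topicOrQueueName, cancellationToken, and prepared batches list from the snippet in the question, and using the suggested 2:1 ratio against the available cores as a starting point:

int degreeOfConcurrency = Math.Max(2, Environment.ProcessorCount * 2);

// One dedicated sender - and therefore one dedicated AMQP link - per worker.
ServiceBusSender[] senders = Enumerable.Range(0, degreeOfConcurrency)
    .Select(_ => client.CreateSender(topicOrQueueName))
    .ToArray();

// Each worker drains its slice of the batches sequentially, so no single link
// ever has more than one send in flight.
Task[] workers = senders
    .Select((workerSender, workerIndex) => Task.Run(async () =>
    {
        for (int b = workerIndex; b < batches.Count; b += senders.Length)
        {
            await workerSender.SendMessagesAsync(batches[b].Batch, cancellationToken);
        }
    }))
    .ToArray();

await Task.WhenAll(workers);

foreach (ServiceBusSender workerSender in senders)
{
    await workerSender.DisposeAsync();
}

If the workload grows, the same pattern extends to multiple ServiceBusClient instances so that groups of senders get their own connections.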
Thanks @jsquire - nice and informative.
I'll re-paste the link from above, as the hyperlink has a typo in it: Best Practices for performance improvements using Service Bus Messaging.
Here is the specific section talking about creating multiple senders, which, like me, people may have missed 😄