[BUG] Service Bus reconnection not behaving as expected
Library name and version
Azure.Messaging.ServiceBus (7.6.0)
Describe the bug
We have noticed that we sometimes get exceptions from Service Bus that look like the following:
ScheduleMessageAsync Exception: Azure.Messaging.ServiceBus.ServiceBusException: The link 'G23:RR:213866011:637468380794500000:<queue-name-here>$management:103182:sender' is force detached. Code: ServerError. Details: AmqpControlProtocolClient.Fault. TrackingId:4dbabb40-1cc1-4165-84d9-208b5d2968ef_B35, SystemTracker:<queue-name-here>, Timestamp:2022-05-28T07:21:30 Reference:3bc9b3fb-f97a-4f20-8106-68e709da3b08, TrackingId:1685d478-88e8-4bfd-b2a6-7093948b2e31_G23, SystemTracker:NoSystemTracker, Timestamp:2022-05-28T07:21:30 (GeneralError)
at Azure.Messaging.ServiceBus.Amqp.AmqpSender.ScheduleMessageInternalAsync(IReadOnlyList`1 messages, TimeSpan timeout, CancellationToken cancellationToken)
at Azure.Messaging.ServiceBus.Amqp.AmqpSender.ScheduleMessageInternalAsync(IReadOnlyList`1 messages, TimeSpan timeout, CancellationToken cancellationToken)
at Azure.Messaging.ServiceBus.Amqp.AmqpSender.<>c.<<ScheduleMessagesAsync>b__24_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken)
at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken)
at Azure.Messaging.ServiceBus.Amqp.AmqpSender.ScheduleMessagesAsync(IReadOnlyList`1 messages, CancellationToken cancellationToken)
at Azure.Messaging.ServiceBus.ServiceBusSender.ScheduleMessagesAsync(IEnumerable`1 messages, DateTimeOffset scheduledEnqueueTime, CancellationToken cancellationToken).
I did some light searching and couldn’t find any information on the most meaningful parts of the error. Once we see this error, the client seems to get into a state where it no longer sends any messages, and we usually have to restart the service. Even if we leave it for hours, it never seems to reconnect.
The documentation for Service Bus seems to suggest that reconnection should be handled automatically by the library, so we are unsure what the recommended way of handling this occurrence is, or how to detect it programmatically so that we know when to try something more forceful.
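For illustration, here is a minimal sketch of what "something more forceful" could look like. It is not a confirmed fix or a maintainer recommendation. It assumes that the trailing `(GeneralError)` in the message above corresponds to `ServiceBusException.Reason`, and that replacing the `ServiceBusSender` is enough to recover. If the underlying connection is the broken part, the `ServiceBusClient` itself may need to be recreated instead. The `RecreatingScheduler` class and its `ScheduleAsync` method are hypothetical names, not part of the library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

// Hypothetical wrapper (not part of Azure.Messaging.ServiceBus): owns a sender
// and swaps it for a fresh one when a schedule call fails with GeneralError.
// Not thread-safe; concurrent callers would need a lock around the swap.
public sealed class RecreatingScheduler : IAsyncDisposable
{
    private readonly ServiceBusClient _client;
    private readonly string _queueName;
    private ServiceBusSender _sender;

    public RecreatingScheduler(ServiceBusClient client, string queueName)
    {
        _client = client;
        _queueName = queueName;
        _sender = client.CreateSender(queueName);
    }

    public async Task<long> ScheduleAsync(
        ServiceBusMessage message,
        DateTimeOffset scheduledEnqueueTime,
        CancellationToken cancellationToken = default)
    {
        try
        {
            return await _sender.ScheduleMessageAsync(
                message, scheduledEnqueueTime, cancellationToken);
        }
        catch (ServiceBusException ex)
            when (ex.Reason == ServiceBusFailureReason.GeneralError)
        {
            // Assumption: the sender's link is stuck after the force detach,
            // so dispose it and try exactly once more with a new sender.
            ServiceBusSender broken = _sender;
            _sender = _client.CreateSender(_queueName);
            await broken.DisposeAsync();

            // A second failure propagates to the caller unchanged.
            return await _sender.ScheduleMessageAsync(
                message, scheduledEnqueueTime, cancellationToken);
        }
    }

    public ValueTask DisposeAsync() => _sender.DisposeAsync();
}
```

The built-in retry policy has already run by the time the exception reaches this `catch`, so the wrapper adds only one extra attempt on a new sender and does not loop.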
Expected behavior
After a server failure, the client library should reconnect.
Actual behavior
After a server failure, the client library does not reconnect.
Reproduction Steps
Unfortunately, I do not have any. The problem seems intermittent and appears to be caused by something on the server side.
Environment
- .NET Core 3.1
- Docker container (mcr.microsoft.com/dotnet/aspnet:3.1)
- Running in AKS
If any more detailed information is required, I am happy to provide it, but I don’t know what would be useful.
Issue Analytics
- Created a year ago
- Comments:16 (8 by maintainers)
Top GitHub Comments
@JoshLove-msft apologies for the late reply, we had bank holidays here in the UK.
Yes. Exceptions are as follows:
I’ll have to see about the logging. Is there any way I can narrow it to just the necessary logs? We have a lot of services, and with this being somewhat transient, I’d need to roll it out in many places.
Namespace is wk-sbx01-westeurope-servicebus and the queue name was sandbox.matching.commands in the examples I gave.
Thank you for your feedback. Tagging and routing to the team member best able to assist.