Calling broker.shutdown or subscription.cancel waits indefinitely if connection is not healthy


If the connection to RabbitMQ is lost and the broker is still reconnecting, calling broker.shutdown() or any subscription.cancel() waits indefinitely.

I discovered this issue in a k8s environment. The readiness probe of a service reported failure because the RabbitMQ connection was lost. After some time, k8s restarts the service by sending it a SIGTERM signal. I handle this signal to dispose of resources, including shutting down the broker, but the shutdown hangs.
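
A possible stopgap on the application side is to bound the shutdown with a timeout so the SIGTERM handler always finishes. The sketch below is an assumption, not something from the issue thread: `withTimeout`, `disposeResources`, and `stuckBroker` are hypothetical names, and `stuckBroker` only stands in for a promise-based broker whose `shutdown()` never settles.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a broker on an unhealthy connection: shutdown() never settles.
// In a real service this would be the actual broker instance (assumption).
const stuckBroker = { shutdown: () => new Promise(() => {}) };

// Try a clean shutdown, but give up after `ms` milliseconds.
async function disposeResources(broker, ms = 5000) {
  try {
    await withTimeout(broker.shutdown(), ms, 'broker.shutdown');
    return 'clean';
  } catch (err) {
    console.error(err.message);
    return 'forced';
  }
}

// On SIGTERM, exit with 0 after a clean shutdown and 1 after a forced one.
process.once('SIGTERM', async () => {
  const outcome = await disposeResources(stuckBroker);
  process.exit(outcome === 'clean' ? 0 : 1);
});
```

This does not fix the hang itself. It only ensures the process exits before k8s has to escalate to SIGKILL, so the timeout should be shorter than the pod's termination grace period.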

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

2 reactions
cressie176 commented, Jan 2, 2021

Hi @satazor,

This is a good find, thank you for reporting. The underlying issue is with how Rascal serializes subscriber channel operations. When a connection error occurs, two things happen:

  1. Rascal queues requests to obtain a new channel until the connection is re-established
  2. The subscriber requests a channel in order to re-subscribe and becomes blocked

Additionally, when your code calls broker.shutdown…

  3. Rascal iterates over all the subscriptions and cancels them. This involves getting the subscription’s channel.

Operations using the subscription’s channel are serialized via an in-memory queue. The cancel operation (3) ends up queued behind the re-subscribe (2), which is blocked until the connection is re-established.
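
The blocking described above can be reproduced with a small model. This is an illustrative sketch only, not Rascal's actual implementation: `SerialQueue`, `channelOps`, `reconnect`, and `log` are hypothetical names, and the connection is simulated by a promise that resolves only when `reconnect()` is called.

```javascript
// Illustrative model: a queue that runs one async task at a time, in order.
class SerialQueue {
  constructor() {
    this.tail = Promise.resolve();
  }

  push(task) {
    const result = this.tail.then(() => task());
    // Keep the chain usable even if a task rejects.
    this.tail = result.catch(() => {});
    return result;
  }
}

// Simulated connection: `reconnected` settles only when reconnect() is called.
let reconnect;
const reconnected = new Promise((resolve) => {
  reconnect = resolve;
});

const channelOps = new SerialQueue();
const log = [];

// (2) Re-subscribe: needs a new channel, so it waits for the connection.
channelOps.push(async () => {
  await reconnected;
  log.push('resubscribed');
});

// (3) Cancel, as issued during shutdown: queued behind (2).
const cancelled = channelOps.push(async () => {
  log.push('cancelled');
});
```

In this model `cancelled` stays pending for as long as `reconnect()` is never called, which mirrors the indefinite wait reported in the issue. Once `reconnect()` runs, both queued operations complete in order.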

I’m not sure how easy it will be to detangle this, but will try to improve the behaviour if I can.

1 reaction
cressie176 commented, Sep 22, 2020

Hi @satazor,

Just letting you know I haven’t forgotten about this issue. Sorry it’s taking so long.
