question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Subscribe experimental doesn't appear to be retrying on retryable exceptions, also doesn't surface the error.

See original GitHub issue

Full context at https://github.com/GoogleCloudPlatform/google-cloud-python/issues/4841#issuecomment-386988007 downwards.

gist with logs: https://gist.github.com/psalaberria002/e91956d94e82d706d65da7111a40ca24

Possible culprits:

  1. send() and recv() expectedly error on next(self.call), but if that was a retryable error that somehow re-bubbled, it would not trigger any callbacks.
  2. bidi uses rpc._wrapped because api_core doesn’t appropriately expose add_done_callback. This means that bidi unwraps all of the retry logic.

This doesn’t explain why retrying a perfectly normal looking error somehow still leads us to a broken stream, unless the newly opened stream also has an error. But it’s unclear why that wouldn’t happen when calling the initial RPC.

Issue Analytics

  • State:closed
  • Created 5 years ago
  • Comments:43 (20 by maintainers)

github_iconTop GitHub Comments

2reactions
jam182commented, Jun 6, 2018

hey thanks!

here is some new logs, I could reproduce the error three times, it’s much harder in DEBUG mode: 1.

DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 3 batched requests
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | The current p99 value is 10 seconds.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Renewing lease for 3 ack IDs.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 3 batched requests
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Snoozing lease management for 3.072772 seconds.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 4 messages.
INFO  | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited. <google.api_core.grpc_helpers._StreamingResponseIterator object at 0x7fa07d4248d0>
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 4 batched requests
INFO  | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 12:47:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 5 messages.
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited. <google.api_core.grpc_helpers._StreamingResponseIterator object at 0x7f40a40322e8>
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | RPC termination has signaled streaming pull manager shutdown.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping consumer.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping scheduler.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Cleanly exiting request generator.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping leaser.
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping dispatcher.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Exiting the QueueCallbackWorker.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping heartbeater.
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Finished stopping manager.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 6 batched requests
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 6 batched requests
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 8 messages.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited. <google.api_core.grpc_helpers._StreamingResponseIterator object at 0x7f829405a6a0>
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | RPC termination has signaled streaming pull manager shutdown.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 9 batched requests
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping consumer.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Cleanly exiting request generator.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping scheduler.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping leaser.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping dispatcher.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Error in queue callback worker: 'NoneType' object has no attribute 'remove'
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/helper_threads.py", line 112, in __call__
    self._callback(items)
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 101, in dispatch_callback
    self.ack(batched_commands.pop(requests.AckRequest))
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 124, in ack
    self.drop(items)
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 132, in drop
    self._manager.leaser.remove(items)
AttributeError: 'NoneType' object has no attribute 'remove'
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Exiting the QueueCallbackWorker.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping heartbeater.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Finished stopping manager.
1reaction
theacodescommented, Jun 6, 2018

Thanks, @jam182. let’s continue this over at #5444.

Read more comments on GitHub >

github_iconTop Results From Across the Web

Spring Retry does not work on 2nd level of methods
Once getSubscriberAccount has been invoked and execution has passed through the proxy and into the @Service instance of the object, no reference to...
Read more >
Handling operation errors - Apollo GraphQL Docs
Apollo Client helps you handle these errors according to their type, enabling you to show appropriate information to the user when an error...
Read more >
Pokemon Catch - Generic Exception Handling In Apex
My main problem with catching the generic Exception is that it demonstrates that you do not know the expected behavior or failure paths...
Read more >
Release Notes — Airflow Documentation
TaskLogReader class, which does not have the above restrictions. If a sensor times out, it will not retry¶. Previously, a sensor is retried...
Read more >
Reactor 3 Reference Guide
Reactor 3 does not officially support or target Android (consider using ... retry , as its name indicates, lets you retry an error-producing ......
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found