Subscription errors and does not reconnect when connection is cancelled by server
See original GitHub issueAfter upgrading @google-cloud/pubsub
to escape a cpu issue in @grpc/grpc-js
, we found that services using a long-running pubsub subscription began to crash periodically during local development and in GKE.
Issue
When the pubsub service closes the underlying http2 connecton (this happens after 15 minutes), the connected subscription instance emits an error
event and does not reconnect.
In a previous version of the library, the subscription instance would reconnect (and wouldn’t emit an error
event). However, it looks like the CANCELLED
status was removed from the RETRY_CODES
list found in src/pull-retry.ts
in this commit, which means we skip the the retry block and move on to destroy the stream with a StatusError
in src/message-stream.ts
here
Since the pubsub service reserves the right to cancel a connection at any time, I would expect this library to handle reconnecting when the pubsub service cancels said connection and not emit an error
message on the subscription instance.
The simplest workaround I’ve found so far is to manually add the retry code for CANCELLED
back into the RETRY_CODES
list before requiring @google-cloud/pubsub
proper:
const { RETRY_CODES } = require('@google-cloud/pubsub/build/src/pull-retry');
RETRY_CODES.push(1);
const { PubSub } = require('@google-cloud/pubsub');
Environment details
- OS: GKE
- Node.js version: 13.11.0
- package versions:
@google-cloud/pubsub
:1.7.2
google-gax
:1.15.2
@grpc/grpc-js
:0.7.9
Steps to reproduce
The simplest way to reproduce this issue is to create a subscription to a topic using the pubsub-emulator and wait 15 minutes.
Example reproduction code:
const { PubSub } = require('@google-cloud/pubsub');
const options = {
projectId: 'local-dev',
apiEndpoint: 'localhost:8085',
};
const pubsub = new PubSub(options);
const topicName = 'test-topic';
const subscriptionName = 'test-topic-subscription';
const run = async () => {
const topicDefinition = pubsub.topic(topicName);
await topicDefinition.get({ autoCreate: true });
const subscriptionDefinition = topicDefinition.subscription(subscriptionName);
const [subscription] = await subscriptionDefinition.get({ autoCreate: true });
subscription.on('message', (message) => {
console.log('message received', { message });
});
subscription.on('error', (error) => {
console.error(`Error in pubsub subscription: ${error.message}`, {
error, topicName, subscriptionName,
});
});
subscription.on('close', () => {
console.error('Pubsub subscription closed', {
topicName, subscriptionName,
});
});
console.log('handlers added to subscription');
};
run();
Running the above script without sending any messages to the topic yielded this error after 15 minutes:
Error in pubsub subscription: Call cancelled {
error: StatusError: Call cancelled
at MessageStream._onEnd ([...]/Development/temp/testing-pubsub/node_modules/@google-cloud/pubsub/build/src/message-stream.js:234:26)
at MessageStream._onStatus ([...]/Development/temp/testing-pubsub/node_modules/@google-cloud/pubsub/build/src/message-stream.js:271:18)
at ClientDuplexStreamImpl.<anonymous> ([...]/Development/temp/testing-pubsub/node_modules/@google-cloud/pubsub/build/src/message-stream.js:143:44)
at Object.onceWrapper (events.js:422:26)
at ClientDuplexStreamImpl.emit (events.js:315:20)
at Object.onReceiveStatus ([...]/Development/temp/testing-pubsub/node_modules/@grpc/grpc-js/build/src/client.js:391:24)
at Object.onReceiveStatus ([...]/Development/temp/testing-pubsub/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:303:181)
at Http2CallStream.outputStatus ([...]/Development/temp/testing-pubsub/node_modules/@grpc/grpc-js/build/src/call-stream.js:114:27)
at Http2CallStream.maybeOutputStatus ([...]/Development/temp/testing-pubsub/node_modules/@grpc/grpc-js/build/src/call-stream.js:153:22)
at Http2CallStream.endCall ([...]/Development/temp/testing-pubsub/node_modules/@grpc/grpc-js/build/src/call-stream.js:140:18) {
code: 1,
details: 'Call cancelled',
metadata: Metadata { internalRepr: Map(0) {}, options: {} }
},
topicName: 'test-topic',
subscriptionName: 'test-topic-subscription'
}
Issue Analytics
- State:
- Created 3 years ago
- Reactions:3
- Comments:37 (9 by maintainers)
Top GitHub Comments
Hey, good(?) news!
I only saw this with grpc-js, which is also an interesting data point. I’ll be bugging a grpc person about this probably.
We are currently experiencing this same issue using gcloud-sdk docker image. What’s the possible fix ? @feywind