
Make backoff policy in AbstractManagedChannelImplBuilder and subclasses customizable

See original GitHub issue

Currently, the AbstractManagedChannelImplBuilder.build() method creates the backoff policy with hard-coded default values:

  @Override
  public ManagedChannel build() {
    return new ManagedChannelOrphanWrapper(new ManagedChannelImpl(
        this,
        buildTransportFactory(),
        new ExponentialBackoffPolicy.Provider(),
        SharedResourcePool.forResource(GrpcUtil.SHARED_CHANNEL_EXECUTOR),
        GrpcUtil.STOPWATCH_SUPPLIER,
        getEffectiveInterceptors(),
        TimeProvider.SYSTEM_TIME_PROVIDER));
  }

This makes it impossible to configure the backoff policy when using subclasses such as NettyChannelBuilder.

Ideally we should be able to customize the backoff policy on each builder implementation (NettyChannelBuilder is the one I specifically care about at the moment) and fall back to the default policy if no custom value has been provided.

Please let me know if there are any other ways of configuring this policy or if I’m missing something.
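For illustration, here is the kind of knob the issue is asking for. This is not an existing gRPC API: ExponentialBackoffPolicy and its Provider are internal to grpc-java, so the standalone class below only mirrors the shape of a configurable exponential backoff (initial delay, cap, multiplier, jitter) that such a builder hook could accept.

```java
import java.util.Random;

// Hypothetical sketch: grpc-java's ExponentialBackoffPolicy is an internal class,
// so this standalone equivalent only illustrates the parameters (initial backoff,
// cap, multiplier, jitter) that the requested builder hook would need to expose.
final class ConfigurableExponentialBackoff {
  private final long maxBackoffMillis;
  private final double multiplier;
  private final double jitter;
  private final Random random = new Random();
  private long nextBackoffMillis;

  ConfigurableExponentialBackoff(long initialBackoffMillis, long maxBackoffMillis,
                                 double multiplier, double jitter) {
    this.maxBackoffMillis = maxBackoffMillis;
    this.multiplier = multiplier;
    this.jitter = jitter;
    this.nextBackoffMillis = initialBackoffMillis;
  }

  // Delay before the next reconnect attempt: grows geometrically up to the cap,
  // with +/- jitter so a fleet of clients does not reconnect in lock step.
  long nextBackoffMillis() {
    long current = nextBackoffMillis;
    nextBackoffMillis = Math.min((long) (current * multiplier), maxBackoffMillis);
    long offset = (long) ((random.nextDouble() * 2 - 1) * jitter * current);
    return Math.max(0L, current + offset);
  }
}
```

Exposed through something like a (hypothetical) NettyChannelBuilder.backoffPolicyProvider(...) setter, this would let callers shorten the default 120-second cap without touching gRPC internals.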

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 17 (8 by maintainers)

Top GitHub Comments

2 reactions
mfateev commented, Sep 29, 2020

Our use case is business-level transactions going through the system. The number of clients connecting to the backend service is usually small (fewer than 100), and each minute of an outage has a high business impact. So any 5-minute network or DB outage is extended by another 2 minutes of backoff that we cannot control. We also hide the gRPC client behind our own client-side library, so we never expect any of the issues you outlined, as we control the maximum timeout.

Do you have any empirical data on how many clients it takes to overwhelm a service or network with reconnect requests alone when the maximum reconnect timeout is, say, 10 seconds? I suspect this concern is mostly theoretical for 99.9% of real-life deployments outside of Google's infrastructure. And judging by the number of developers who have asked for this change, the lack of the feature causes real pain.
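The extra two minutes the comment describes follow directly from the defaults in the gRPC connection backoff spec (1 s initial backoff, 1.6 multiplier, 120 s cap; jitter omitted here for clarity). A back-of-the-envelope sketch, assuming those published defaults:

```java
// Computes the reconnect interval in force when an outage of the given length
// ends, assuming the gRPC connection backoff spec defaults (initial 1 s,
// multiplier 1.6, cap 120 s) and ignoring the +/-20% jitter for clarity.
final class BackoffSchedule {
  static long delayWhenOutageEndsSec(double initialSec, double multiplier,
                                     double maxSec, double outageSec) {
    double elapsed = 0;
    double delay = initialSec;
    // Walk the schedule until the next attempt would land after the outage ends.
    while (elapsed + delay < outageSec) {
      elapsed += delay;
      delay = Math.min(delay * multiplier, maxSec);
    }
    return Math.round(delay);
  }

  public static void main(String[] args) {
    // By the time a 5-minute (300 s) outage ends, the interval has already hit
    // the 120 s cap, so the client can sit out up to ~2 more minutes.
    System.out.println(delayWhenOutageEndsSec(1, 1.6, 120, 300));  // prints 120
  }
}
```

Lowering the cap (e.g. to 10 s) bounds that post-outage wait accordingly, which is exactly the control the commenters are asking for.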

1 reaction
hmendesB2C2 commented, Dec 9, 2021

I have a sidecar app running in Kubernetes that receives about 5k gRPC calls per second from a client; if there is a disconnect, the client can be left waiting minutes to reconnect. We need to be able to control at least the initial and maximum backoffs. It doesn't make any sense to wait tens of seconds to reconnect. If my app were going to kill the network, the 5k calls per second would have done that already.

Read more comments on GitHub >

Top Results From Across the Web

Is it possible to customize the backoff policy used by a retry ...
Use a custom BackOffPolicy that delegates to the desired BackOffPolicy, depending on the lastThrowable in the retryContext.

ExponentialBackOffPolicy (Spring Retry 1.2.2.RELEASE API)
Implementation of BackOffPolicy that increases the back off period for each retry attempt in a given set using the exponential function.

Strategies Explained — Backoff-Utils 1.0.1 documentation
You can subclass it to create your own custom strategies, or you can supply one of our ready-made strategies as the strategy argument...

BackoffPolicy (Elasticsearch: Core 2.3.0 API) - Javadoc.io
Notes for implementing custom subclasses: the underlying mathematical principle of BackoffPolicy is progressions, which can be either finite or infinite ...

SpringBoot Retry Random Backoff - Medium
Therefore, I wanted to make sure that failed transactions due to a DB deadlock ... Spring has an exponential random backoff policy that...
