Occasional NullPointerException in io.grpc.internal.RetriableStream.drain(RetriableStream.java:279) when using hedgingPolicy

What version of gRPC-Java are you using?

Version 1.42.2

What is your environment?

Linux

What did you expect to see?

I’m using the hedging retry policy via configuration and occasionally see a NullPointerException.

Here’s a snippet of the Kotlin code that configures the hedging policy with an 85ms hedging delay:

.defaultServiceConfig(
    mapOf(
        "loadBalancingPolicy" to "round_robin",
        "methodConfig" to listOf(
            mapOf(
                "name" to listOf(
                    mapOf(
                        "service" to "my.org.Service",
                        "method" to "MyMethod"
                    )
                ),
                "waitForReady" to true,
                "hedgingPolicy" to mapOf(
                    "maxAttempts" to 2.0,
                    "hedgingDelay" to "0.085s",
                    "nonFatalStatusCodes" to listOf(Status.UNAVAILABLE.code.name)
                )
            )
        )
    )
)
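
For anyone trying to reproduce this, here is a minimal sketch of the same config built with only the Kotlin standard library, so the map can be inspected before it is handed to the channel builder. The helper name `hedgingServiceConfig` and its parameters are hypothetical and not part of gRPC. It assumes, as I understand the `defaultServiceConfig` documentation, that numeric values in the map must be `Double`, which is why `maxAttempts` is converted.

```kotlin
// Hypothetical helper (not a gRPC API): builds a service config map
// equivalent to the snippet above using only the Kotlin stdlib.
fun hedgingServiceConfig(
    service: String,
    method: String,
    maxAttempts: Int,
    hedgingDelayMillis: Long,
    nonFatalStatusCodes: List<String>,
): Map<String, Any> {
    require(hedgingDelayMillis >= 0) { "hedgingDelayMillis must not be negative" }

    // Format the delay as seconds with millisecond precision, e.g. 85 -> "0.085s".
    // Built by hand so the result does not depend on the default locale.
    val seconds = hedgingDelayMillis / 1000
    val millis = (hedgingDelayMillis % 1000).toString().padStart(3, '0')
    val delay = "$seconds.${millis}s"

    return mapOf(
        "loadBalancingPolicy" to "round_robin",
        "methodConfig" to listOf(
            mapOf(
                "name" to listOf(
                    mapOf("service" to service, "method" to method)
                ),
                "waitForReady" to true,
                "hedgingPolicy" to mapOf(
                    // Assumption: numbers in the config map are expected as Double.
                    "maxAttempts" to maxAttempts.toDouble(),
                    "hedgingDelay" to delay,
                    "nonFatalStatusCodes" to nonFatalStatusCodes,
                ),
            )
        ),
    )
}

fun main() {
    val config = hedgingServiceConfig(
        service = "my.org.Service",
        method = "MyMethod",
        maxAttempts = 2,
        hedgingDelayMillis = 85,
        // Same string that Status.UNAVAILABLE.code.name yields in the snippet above.
        nonFatalStatusCodes = listOf("UNAVAILABLE"),
    )
    println(config)
}
```

The resulting map would be passed to `.defaultServiceConfig(...)` exactly as in the original snippet. This only makes the configuration easier to check; it is not a fix for the NullPointerException.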

What did you see instead?

java.lang.NullPointerException: null
    at io.grpc.internal.RetriableStream.drain(RetriableStream.java:279)
    at io.grpc.internal.RetriableStream.access$1100(RetriableStream.java:55)
    at io.grpc.internal.RetriableStream$HedgingRunnable$1.run(RetriableStream.java:476)
    at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:79)
    at io.micrometer.core.instrument.Timer.lambda$wrap$0(Timer.java:160)
    at io.micronaut.scheduling.instrument.InvocationInstrumenterWrappedRunnable.run(InvocationInstrumenterWrappedRunnable.java:47)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(Thread.java:834)

Steps to reproduce the bug

Enable the hedging policy shown above.

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

1 reaction
ejona86 commented, May 19, 2022

I don’t think those guarded by things are that important. All those locks are the same instance, just aliases. So it is more just that the code is written in a way that isn’t able to be verified by tooling.

1 reaction
ejona86 commented, May 18, 2022

Since writing a message has to be done after start(), it feels like that buffer leak should be unrelated. But that stack trace is in the same hedging draining code, which does make it suspiciously related.
