Parent can only be null in a local root

Describe the bug

I am using Spring Cloud Sleuth 3.0.1-SNAPSHOT (as of 2020-12-31). The bug is really sporadic, but it does cause the workflow engine (Flowable, which I have instrumented manually) to not call the next tasks in the BPMN workflow.

The error is this:

Exception in thread "orders-flowable-async-job-executor-thread-4" java.lang.AssertionError: Bug (or unexpected call to internal code): parent can only be null in a local root!
	at brave.internal.recorder.PendingSpans.getOrCreate(PendingSpans.java:89)
	at brave.Tracer._toSpan(Tracer.java:410)
	at brave.Tracer.toSpan(Tracer.java:382)
	at brave.Tracer.toSpan(Tracer.java:376)
	at brave.LazySpan.span(LazySpan.java:141)
	at brave.LazySpan.context(LazySpan.java:40)
	at org.springframework.cloud.sleuth.brave.bridge.BraveSpan.context(BraveSpan.java:48)
	at org.springframework.cloud.sleuth.brave.bridge.BraveTracer.nextSpan(BraveTracer.java:52)
	at org.springframework.cloud.sleuth.instrument.async.TraceRunnable.run(TraceRunnable.java:61)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

I think it originates in TraceRunnable, at the call to nextSpan (TraceRunnable.java:61 in the stack trace above).

I am of the opinion that TraceRunnable should not prevent the delegate Runnable from running in any way. We might benefit from a check before calling the nextSpan method, or from catching this exception somehow.
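
The "catch this exception somehow" idea could look roughly like the sketch below. It is not Sleuth's TraceRunnable: FailSafeTracedRunnable and scopeOpener are hypothetical names, and scopeOpener only stands in for whatever creates the span and puts it in scope. The point is just that a failure while setting up tracing is swallowed and the delegate still runs, untraced.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical wrapper, not part of Spring Cloud Sleuth or Brave.
 * Runs the delegate even when opening the tracing scope fails.
 */
final class FailSafeTracedRunnable implements Runnable {

    // Stand-in for "create the next span and put it in scope" (assumption).
    private final Callable<AutoCloseable> scopeOpener;
    private final Runnable delegate;

    FailSafeTracedRunnable(Callable<AutoCloseable> scopeOpener, Runnable delegate) {
        this.scopeOpener = scopeOpener;
        this.delegate = delegate;
    }

    @Override
    public void run() {
        AutoCloseable scope = null;
        try {
            scope = scopeOpener.call();
        } catch (AssertionError | Exception e) {
            // Tracing setup failed (for example with the AssertionError above);
            // fall through and run the delegate without a span.
        }
        try {
            delegate.run();
        } finally {
            if (scope != null) {
                try {
                    scope.close();
                } catch (Exception ignored) {
                    // Closing the scope must not mask the delegate's outcome.
                }
            }
        }
    }
}
```

The trade-off is that the affected task loses its trace context instead of being dropped, which seems preferable for a workflow engine that must not skip tasks.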

Sample

I can’t reliably reproduce it; it is sporadic. We have on the order of 100 tests using Flowable, and I get one error every now and then.

Flowable can be set up with a custom implementation of a JobExecutor. I have it set up like so:

    private AsyncExecutor initAsyncExecutor(String tennantId, final int maxConcurrent) {
        final DefaultAsyncJobExecutor asyncExecutor = new DefaultAsyncJobExecutor();
        asyncExecutor.setAutoActivate(false);

        // other setup, irrelevant to the case

        // customized default async job executor initialization with tracing info
        // org.flowable.job.service.impl.asyncexecutor.DefaultAsyncJobExecutor.initAsyncJobExecutionThreadPool
        asyncExecutor.setThreadPoolQueue(new ArrayBlockingQueue<>(queueSize));
        asyncExecutor.setExecutorService(
            new TraceableExecutorService(
                this.applicationContext,
                new ThreadPoolExecutor(
                    asyncExecutor.getCorePoolSize(),
                    asyncExecutor.getMaxPoolSize(),
                    asyncExecutor.getKeepAliveTime(), TimeUnit.MILLISECONDS,
                    asyncExecutor.getThreadPoolQueue(),
                    new BasicThreadFactory.Builder()
                        .namingPattern(tennantId + "-flowable-async-job-executor-thread-%d")
                        .build()
                )
            )
        );

        return asyncExecutor;
    }

P.S. I am really grateful to you guys for the libraries you produce. They make my job easier!

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 25 (14 by maintainers)

Top GitHub Comments

andylintner commented, Sep 8, 2021 (4 reactions)

FYI - I’ve submitted a fix for this to Brave in https://github.com/openzipkin/brave/pull/1306

In the meantime, here’s a hack for anyone else impacted to resolve it until that change can be merged and released:

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
class BraveTracerNextSpanRetryAspect {

    /*
     * Stupid hack to retry BraveTracer::nextSpan(span) because of a race condition wherein a span being loaded concurrently
     * with the span being flushed will throw an AssertionError. Since the flush will have completed, retrying succeeds.
     */
    @Around("execution(org.springframework.cloud.sleuth.Span org.springframework.cloud.sleuth.brave.bridge.BraveTracer.nextSpan(..))")
    public Object wrapNextSpan(ProceedingJoinPoint pjp) throws Throwable {
        try {
            return pjp.proceed();
        } catch (AssertionError e) {
            // Retry once; a second AssertionError propagates to the caller.
            return pjp.proceed();
        }
    }
}
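
For code paths where an aspect is not an option, the same retry-on-AssertionError idea can be written as a plain helper. This is only a sketch under assumptions: AssertionRetry and callWithRetry are hypothetical names, not Brave or Sleuth APIs, and the bounded retry count is an illustrative choice rather than something the comment above prescribes.

```java
import java.util.function.Supplier;

/** Hypothetical helper, not part of Brave or Spring Cloud Sleuth. */
final class AssertionRetry {

    private AssertionRetry() {
    }

    /**
     * Calls the action and retries up to maxRetries more times when it throws
     * AssertionError. The last AssertionError propagates if every attempt fails.
     */
    static <T> T callWithRetry(Supplier<T> action, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        AssertionError last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.get();
            } catch (AssertionError e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

With maxRetries set to 1 this behaves like the aspect: one retry, after which a second failure reaches the caller. A call site might look like `AssertionRetry.callWithRetry(() -> tracer.nextSpan(parent), 1)`, where tracer and parent are whatever the surrounding code already has.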

marcingrzejszczak commented, Mar 17, 2021 (1 reaction)

You can file an issue in Brave and maybe link this one?
