Duplicated Dynatrace PurePaths because Sleuth does not close spans properly

Describe the bug

We found an issue in Dynatrace while trying out the integration between Spring Cloud Sleuth + OpenTelemetry + Dynatrace.

Dynatrace has had its own “trace/span” concept for years, named “PurePath”. For some months now, it has been possible to enrich PurePaths with OpenTelemetry details and attributes.

Unfortunately, using a POC we found that a single HTTP request generates 2 PurePaths (instead of 1), and the root cause seems to be an issue in Sleuth, which does not close the span properly.

Sample

The requirements for reproducing this issue are:

  • JDK 11
  • A Dynatrace subscription (a free 15-day account is available)
  • A host with Dynatrace OneAgent installed, on which to run the POC

The steps to reproduce the issue are:

  • Build the Spring Boot app hosted here: https://github.com/diepet/spring-boot-dynatrace-otel-poc
  • Run the Spring Boot app on the host with Dynatrace OneAgent installed
  • Check the Services listed in Dynatrace for that process: one is PurchaseOrderController (correct), and another is “Requests executed in background threads of SpringBoot purchase-order com.acme.order.PurchaseOrderApp” (wrong; it contains only broken PurePaths).

The POC has been developed by using:

  • JDK 11
  • Spring Boot ver.2.6.1
  • Spring Cloud ver.2021.0.0
  • OpenTelemetry 1.9.0
  • Spring Cloud Otel ver.1.1.0-M4

Dynatrace Support Details

A support ticket has already been opened with Dynatrace: request number 22425 (not accessible externally).

They replied with the following details.

It looks like we have found the issue and discovered that the main sequence is:

  1. startSpan (span A)
  2. ContextMakeCurrent (span A)
  3. startSpan (span B)
  4. ContextMakeCurrent (span B)
  5. startSpan (span C)
  6. ContextMakeCurrent (span B)
  7. endSpan (span B)
  8. endSpan (span B)
  9. closeScope (span A)
  10. endSpan (span A)

Every startSpan should have a corresponding endSpan. Every ContextMakeCurrent should have a corresponding closeScope.
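The pairing rule above can be illustrated with a minimal, self-contained sketch. It is only a model: `PairingSketch`, its `Scope` interface and the helper methods are hypothetical stand-ins written for this example, not the Sleuth or OpenTelemetry API. Spans are plain strings and the "current span" is the top of a per-run stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a tracer; not the Sleuth or OpenTelemetry API.
final class PairingSketch {
    static final List<String> events = new ArrayList<>();
    // Spans that were made current; the top of the stack is the "current span".
    static final Deque<String> current = new ArrayDeque<>();

    interface Scope extends AutoCloseable {
        @Override
        void close(); // narrowed so that close() throws no checked exception
    }

    static String startSpan(String name) {
        events.add("startSpan " + name);
        return name;
    }

    static void endSpan(String name) {
        events.add("endSpan " + name);
    }

    // Makes the span current and returns a handle that restores the previous one.
    static Scope makeCurrent(String name) {
        events.add("makeCurrent " + name);
        current.push(name);
        return () -> {
            events.add("closeScope " + name);
            current.pop();
        };
    }

    static List<String> run() {
        events.clear();
        current.clear();
        String a = startSpan("A");
        try (Scope scopeA = makeCurrent(a)) {
            String b = startSpan("B");
            try (Scope scopeB = makeCurrent(b)) {
                // work done while B is the current span
            } finally {
                endSpan(b); // runs after scopeB has been closed
            }
        } finally {
            endSpan(a); // runs after scopeA has been closed
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this model every `startSpan` is matched by an `endSpan` and every `makeCurrent` by a `closeScope`, in strict last-in, first-out order, and the stack of current spans is empty when the request finishes.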

However, this works differently in the app:

  • Step 6 fails because span B is already active.
  • Step 7 fails because scope was not closed yet.
  • Step 8 fails because scope was not closed yet.
  • Step 9 fails because span A is not active at that time (still span B)

We suppose the root cause is a bug in Spring Sleuth itself (not Spring Sleuth OpenTelemetry integration). Specifically in TracePlatformTransactionManager::commit:

https://github.com/spring-cloud/spring-cloud-sleuth/blob/v3.1.0/spring-cloud-sleuth-instrumentation/src/main/java/org/springframework/cloud/sleuth/instrument/tx/TracePlatformTransactionManager.java#L106

Here span is ended:

https://github.com/spring-cloud/spring-cloud-sleuth/blob/v3.1.0/spring-cloud-sleuth-instrumentation/src/main/java/org/springframework/cloud/sleuth/instrument/tx/TracePlatformTransactionManager.java#L130

Here thread local is updated:

https://github.com/spring-cloud/spring-cloud-sleuth/blob/v3.1.0/spring-cloud-sleuth-instrumentation/src/main/java/org/springframework/cloud/sleuth/instrument/tx/TracePlatformTransactionManager.java#L135

But nowhere is scope closed.
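To make the effect of that omission concrete, here is a simplified sketch of the same shape: a scope is opened when the transaction begins, and commit ends the span and updates the thread local without closing the scope. All names (`LeakSketch`, `beginTransaction`, `commit`, `HELD`) are hypothetical, and the rule that a scope can only be exited while its own span is current is an assumption of this model, chosen to mirror the failure of step 9.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the reported leak; not the Sleuth source.
final class LeakSketch {
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>(); // the "current span"
    static final ThreadLocal<Scope> HELD = new ThreadLocal<>();     // scope kept between begin and commit
    static final List<String> log = new ArrayList<>();

    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    static Scope makeCurrent(String span) {
        String previous = CURRENT.get();
        CURRENT.set(span);
        return () -> {
            if (!span.equals(CURRENT.get())) {
                // Model assumption: a scope can only be exited while its span is current.
                log.add("closeScope " + span + " failed: current is " + CURRENT.get());
                return;
            }
            CURRENT.set(previous);
        };
    }

    static void beginTransaction(String span) {
        HELD.set(makeCurrent(span));
    }

    static void commit(String span) {
        log.add("endSpan " + span); // the span is ended
        HELD.remove();              // the thread local is updated
        // ...but the held Scope is never closed, so CURRENT still points at the span
    }

    static String run() {
        log.clear();
        CURRENT.remove();
        try (Scope scopeA = makeCurrent("A")) {
            beginTransaction("B");
            commit("B");
        } // closing A's scope fails here because B is still current
        return CURRENT.get();
    }

    public static void main(String[] args) {
        System.out.println("current after request: " + run());
        log.forEach(System.out::println);
    }
}
```

In this model the ended span B remains the current span after the request, and the attempt to close span A's scope is rejected.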

The scope was created earlier, here:

        at io.opentelemetry.context.Context.makeCurrent(Context.java)
        at io.opentelemetry.context.ImplicitContextKeyed.makeCurrent(ImplicitContextKeyed.java:33)
        at org.springframework.cloud.sleuth.otel.bridge.OtelSpanInScope.<init>(OtelSpanInScope.java:41)
        at org.springframework.cloud.sleuth.otel.bridge.OtelTracer.withSpan(OtelTracer.java:64)
        at org.springframework.cloud.sleuth.ThreadLocalSpan.set(ThreadLocalSpan.java:51)
        at org.springframework.cloud.sleuth.instrument.tx.TracePlatformTransactionManager.taggedSpan(TracePlatformTransactionManager.java:101)
        at org.springframework.cloud.sleuth.instrument.tx.TracePlatformTransactionManager.getTransaction(TracePlatformTransactionManager.java:73)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:595)

https://github.com/spring-cloud/spring-cloud-sleuth/blob/v3.1.0/spring-cloud-sleuth-instrumentation/src/main/java/org/springframework/cloud/sleuth/instrument/tx/TracePlatformTransactionManager.java#L101

https://github.com/spring-cloud/spring-cloud-sleuth/blob/v3.1.0/spring-cloud-sleuth-api/src/main/java/org/springframework/cloud/sleuth/ThreadLocalSpan.java#L51

https://github.com/spring-projects-experimental/spring-cloud-sleuth-otel/blob/v1.1.0-M4/spring-cloud-sleuth-otel/src/main/java/org/springframework/cloud/sleuth/otel/bridge/OtelTracer.java#L64

https://github.com/spring-projects-experimental/spring-cloud-sleuth-otel/blob/v1.1.0-M4/spring-cloud-sleuth-otel/src/main/java/org/springframework/cloud/sleuth/otel/bridge/OtelSpanInScope.java#L41

Notice the JavaDoc of Tracer::withSpan:

Makes the given span the “current span” and returns an object that exits that scope on close. Calls to currentSpan() and currentSpanCustomizer() will affect this span until the return value is closed.

The most convenient way to use this method is via the try-with-resources idiom.

When tracing in-process commands, prefer startScopedSpan(String) which scopes by default.

Note: While downstream code might affect the span, calling this method, and calling close on the result have no effect on the input. For example, calling close on the result does not finish the span. Not only is it safe to call close, you must call close to end the scope, or risk leaking resources associated with the scope.

Params: span – span to place into scope or null to clear the scope

Returns: scope with span in it

https://github.com/spring-cloud/spring-cloud-sleuth/blob/v3.1.0/spring-cloud-sleuth-api/src/main/java/org/springframework/cloud/sleuth/Tracer.java#L103

It clearly states that a created scope must be closed again. And TracePlatformTransactionManager::commit fails to do so.
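When the scope is opened in one method and has to be closed in another, try-with-resources is not available, so the scope handle has to be kept alongside the span and closed explicitly. The sketch below shows one possible shape for that, under stated assumptions: `CommitClosesScopeSketch`, `SpanAndScope`, `beginTransaction` and `commit` are hypothetical names for this illustration, and this is not the change that was actually made in Sleuth.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a commit path that closes the scope; not the actual Sleuth fix.
final class CommitClosesScopeSketch {
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();
    static final ThreadLocal<SpanAndScope> HELD = new ThreadLocal<>();
    static final List<String> log = new ArrayList<>();

    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    // Keeps the span together with the scope that made it current.
    static final class SpanAndScope {
        final String span;
        final Scope scope;

        SpanAndScope(String span, Scope scope) {
            this.span = span;
            this.scope = scope;
        }
    }

    static Scope makeCurrent(String span) {
        String previous = CURRENT.get();
        CURRENT.set(span);
        return () -> {
            log.add("closeScope " + span);
            CURRENT.set(previous); // restore whatever was current before
        };
    }

    static void beginTransaction(String span) {
        log.add("startSpan " + span);
        HELD.set(new SpanAndScope(span, makeCurrent(span)));
    }

    static void commit(Runnable delegateCommit) {
        SpanAndScope held = HELD.get();
        try {
            delegateCommit.run();
        } finally {
            HELD.remove();
            held.scope.close();              // exit the scope first...
            log.add("endSpan " + held.span); // ...then end the span
        }
    }

    static String run() {
        log.clear();
        CURRENT.set("A"); // an outer span is already current
        beginTransaction("B");
        commit(() -> log.add("commit"));
        return CURRENT.get();
    }

    public static void main(String[] args) {
        System.out.println("current after commit: " + run());
        log.forEach(System.out::println);
    }
}
```

Doing the cleanup in a `finally` block means the previous span becomes current again even if the delegated commit throws.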

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

2 reactions
eanton86 commented, Feb 2, 2022

Hi, I seem to have come across the same issue, with a simpler way to trigger it. An operation annotated with @Transactional that runs without an existing trace creates a new trace and span but doesn’t close it properly, so the traceId stays in the thread-local context and carries over to the next transactions. I’ve created a sample project to reproduce it easily: https://github.com/eanton86/test-transaction-traicing
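As an illustration of the behavior described in this comment (this is not code from the linked sample project), the sketch below models a trace id kept in a thread local on a reused worker thread. `ThreadReuseSketch`, `transactional` and the `cleanUp` flag are hypothetical names; `cleanUp = false` stands for the case where the scope is never closed.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of a trace id leaking between tasks on a reused thread.
final class ThreadReuseSketch {
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Simulates a transactional operation that starts a trace only if none exists
    // and returns the trace id it ran under.
    static String transactional(String newTraceId, boolean cleanUp) {
        boolean started = TRACE_ID.get() == null;
        if (started) {
            TRACE_ID.set(newTraceId);
        }
        String seen = TRACE_ID.get();
        if (started && cleanUp) {
            TRACE_ID.remove(); // the equivalent of closing the scope
        }
        return seen;
    }

    static String[] run(boolean cleanUp) {
        // A single worker thread, so both tasks share the same thread local.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            String first = pool.submit(() -> transactional("trace-1", cleanUp)).get();
            String second = pool.submit(() -> transactional("trace-2", cleanUp)).get();
            return new String[] {first, second};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", run(false)));
        System.out.println(String.join(", ", run(true)));
    }
}
```

Without cleanup, the second operation in this model runs under the first operation's trace id; with cleanup, each operation gets its own.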

0 reactions
marcingrzejszczak commented, Feb 2, 2022

Check Sleuth 3.1.1-SNAPSHOT or release train 2021.0.1-SNAPSHOT
