Duplicated Dynatrace PurePaths because Sleuth does not close spans properly
Describe the bug
We found an issue in Dynatrace while trying out the integration between Spring Cloud Sleuth, OpenTelemetry, and Dynatrace.
Dynatrace has had its own trace/span concept, called “PurePath”, for years. For some months now, it has been possible to enrich PurePaths with OpenTelemetry details and attributes.
Unfortunately, with a POC we found that a single HTTP request generates 2 PurePaths (instead of 1). The root cause seems to be an issue in Sleuth, which does not close the span properly.
Sample
The requirements for reproducing this issue are:
- JDK 11
- A Dynatrace subscription (a free 15-day account is available)
- A host with Dynatrace OneAgent installed, on which to run the POC
The steps to reproduce the issue are:
- Build the Spring Boot app hosted here: https://github.com/diepet/spring-boot-dynatrace-otel-poc
- Run the Spring Boot app on the host with Dynatrace OneAgent installed
- Check the Services in Dynatrace for that process: one will be PurchaseOrderController (correct), and another one will be “Requests executed in background threads of SpringBoot purchase-order com.acme.order.PurchaseOrderApp” (this one is wrong and contains only broken PurePaths).
The POC has been developed by using:
- JDK 11
- Spring Boot ver.2.6.1
- Spring Cloud ver.2021.0.0
- OpenTelemetry 1.9.0
- Spring Cloud Otel ver.1.1.0-M4
Dynatrace Support Details
A support ticket has already been opened with Dynatrace (request number 22425, not accessible externally), and they replied with the following details.
It looks like we have found the issue and discovered that the main sequence is:
1. startSpan (span A)
2. ContextMakeCurrent (span A)
3. startSpan (span B)
4. ContextMakeCurrent (span B)
5. startSpan (span C)
6. ContextMakeCurrent (span B)
7. endSpan (span B)
8. endSpan (span B)
9. closeScope (span A)
10. endSpan (span A)
Every startSpan should have a corresponding endSpan.
Every ContextMakeCurrent should have a corresponding closeScope.
However, this works differently in the app:
- Step 6 fails because span B is already active.
- Step 7 fails because scope was not closed yet.
- Step 8 fails because scope was not closed yet.
- Step 9 fails because span A is not active at that time (still span B)
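The pairing rule can be illustrated with a small stand-alone model. This is only a sketch: `ScopePairingDemo`, its `Scope` interface, and `makeCurrent` are hypothetical stand-ins written for this example, not the OpenTelemetry, Sleuth, or Dynatrace implementation. It replays a shortened version of the sequence and compares a run where every scope is closed with a run where the scopes are never closed.

```java
// Hypothetical model of "current span" bookkeeping on one thread.
// None of these names come from OpenTelemetry, Sleuth, or Dynatrace.
class ScopePairingDemo {

    /** Handle returned by makeCurrent; close() restores the previous span. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    private String current; // name of the span that is current, or null

    // Counterpart of ContextMakeCurrent: remembers what was current before.
    Scope makeCurrent(String span) {
        String previous = current;
        current = span;
        return () -> {
            current = previous; // counterpart of closeScope
        };
    }

    // Every makeCurrent is paired with a close via try-with-resources.
    static String currentAfterBalanced() {
        ScopePairingDemo thread = new ScopePairingDemo();
        try (Scope a = thread.makeCurrent("A")) {
            try (Scope b = thread.makeCurrent("B")) {
                // work inside span B
            } // scope B closed here, A is current again
        } // scope A closed here, nothing is current
        return thread.current;
    }

    // The spans would be ended, but no scope is ever closed.
    static String currentAfterLeak() {
        ScopePairingDemo thread = new ScopePairingDemo();
        thread.makeCurrent("A"); // returned Scope is dropped
        thread.makeCurrent("B"); // returned Scope is dropped
        // endSpan(B) and endSpan(A) would happen here, without closeScope
        return thread.current;
    }

    public static void main(String[] args) {
        System.out.println("balanced: " + currentAfterBalanced());
        System.out.println("leaky:    " + currentAfterLeak());
    }
}
```

In this model the balanced run ends with no current span, while the leaky run still reports B as current after both spans are finished, so any later work on the same thread would be attached to a span that has already ended.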
We suppose the root cause is a bug in Spring Sleuth itself (not Spring Sleuth OpenTelemetry integration).
Specifically in TracePlatformTransactionManager::commit:
Here span is ended:
Here thread local is updated:
But nowhere is scope closed.
Previously scope gets created here:
at io.opentelemetry.context.Context.makeCurrent(Context.java)
at io.opentelemetry.context.ImplicitContextKeyed.makeCurrent(ImplicitContextKeyed.java:33)
at org.springframework.cloud.sleuth.otel.bridge.OtelSpanInScope.<init>(OtelSpanInScope.java:41)
at org.springframework.cloud.sleuth.otel.bridge.OtelTracer.withSpan(OtelTracer.java:64)
at org.springframework.cloud.sleuth.ThreadLocalSpan.set(ThreadLocalSpan.java:51)
at org.springframework.cloud.sleuth.instrument.tx.TracePlatformTransactionManager.taggedSpan(TracePlatformTransactionManager.java:101)
at org.springframework.cloud.sleuth.instrument.tx.TracePlatformTransactionManager.getTransaction(TracePlatformTransactionManager.java:73)
at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:595)
Notice the JavaDoc of Tracer::withSpan:
Makes the given span the “current span” and returns an object that exits that scope on close. Calls to currentSpan() and currentSpanCustomizer() will affect this span until the return value is closed.
The most convenient way to use this method is via the try-with-resources idiom.
When tracing in-process commands, prefer startScopedSpan(String) which scopes by default.
Note: While downstream code might affect the span, calling this method, and calling close on the result have no effect on the input. For example, calling close on the result does not finish the span. Not only is it safe to call close, you must call close to end the scope, or risk leaking resources associated with the scope.
Params: span – span to place into scope or null to clear the scope
Returns: scope with span in it
It clearly states that a created scope must be closed again. And TracePlatformTransactionManager::commit fails to do so.
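To make the difference concrete, here is a minimal sketch of a commit that releases both the span and its scope. All types in it (`TxTracingSketch`, `DemoScope`, `SpanAndScope`) are hypothetical stand-ins that only mimic the shape of a withSpan-style call; this is not the Sleuth code and not the actual fix. It assumes the scope handle is kept next to the span when the transaction begins, so that commit can close it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; NOT the Sleuth classes or the actual patch.
class TxTracingSketch {

    /** Mimics a scope handle that must be closed to leave the scope. */
    interface DemoScope extends AutoCloseable {
        @Override
        void close();
    }

    /** Keeps the span name together with its scope so both can be released. */
    static final class SpanAndScope {
        final String span;
        final DemoScope scope;

        SpanAndScope(String span, DemoScope scope) {
            this.span = span;
            this.scope = scope;
        }
    }

    // Lifecycle events in the order they happen.
    final List<String> events = new ArrayList<>();

    // Stand-in for getTransaction: start the span and put it in scope.
    SpanAndScope begin(String span) {
        events.add("startSpan " + span);
        events.add("makeCurrent " + span);
        return new SpanAndScope(span, () -> events.add("closeScope " + span));
    }

    // Behaviour described in the report: the span ends, the scope stays open.
    void commitLeaky(SpanAndScope tx) {
        events.add("endSpan " + tx.span);
    }

    // Balanced variant: try-with-resources closes the scope, finally ends the span.
    void commitBalanced(SpanAndScope tx) {
        try (DemoScope scope = tx.scope) {
            // the delegate transaction manager's commit would run here
        } finally {
            events.add("endSpan " + tx.span);
        }
    }

    static List<String> run(boolean leaky) {
        TxTracingSketch manager = new TxTracingSketch();
        SpanAndScope tx = manager.begin("tx");
        if (leaky) {
            manager.commitLeaky(tx);
        } else {
            manager.commitBalanced(tx);
        }
        return manager.events;
    }

    public static void main(String[] args) {
        System.out.println("leaky:    " + run(true));
        System.out.println("balanced: " + run(false));
    }
}
```

The leaky run records no closeScope event at all, whereas the balanced run records closeScope followed by endSpan, which matches the try-with-resources idiom the JavaDoc recommends.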
Issue Analytics
- Created 2 years ago
- Comments: 11 (5 by maintainers)
Hi, I seem to have come across the same issue, and it is even easier to trigger in my case. An operation annotated with @Transactional that runs without an existing trace creates a new trace and span but doesn’t close it properly, so the traceId stays in the ThreadLocal context and carries over to the next transactions. I’ve created a sample project to reproduce it easily: https://github.com/eanton86/test-transaction-traicing
Check Sleuth 3.1.1-SNAPSHOT or release train 2021.0.1-SNAPSHOT