[PROPOSAL] First-class distributed tracing
Over the past weeks, I’ve been struggling to get distributed tracing to work.
My observation is that a Java application that uses the Dapr SDK for Java does not propagate the `traceparent` HTTP header from a call it receives from the sidecar to the calls it makes back to the sidecar (e.g. an `invokeMethod` or `publishEvent` call).
Looking at the code, I think that if the sidecar were to use gRPC to invoke the Java application, it wouldn’t work either: the `grpc-trace-bin`, `traceparent` or `tracestate` headers are propagated, but they seem to be read from an empty Context. I haven’t been able to get the sidecar to talk gRPC to my Java apps, though, so I’m not 100% sure about this.
Nevertheless, in a platform that wants to simplify distributed systems, I think tracing is an essential thing. I’d love to see the Dapr SDK for Java make it as easy as possible to leverage what Dapr has to offer in this field.
Describe the proposal
Provide the Dapr client code (at least the `DaprClientBuilder`, `DaprHttpBuilder`, `DaprClientHttp` and `DaprClientGrpc` classes, maybe more) with a “strategy” to get a `traceparent` header value. Use that strategy to enrich calls from the application to the sidecar.
The default implementation should return an empty value. A user can supply their own implementation based on the frameworks they are using. For instance, I could envision an implementation that uses OpenTelemetry’s `Span` class to find the correct value for the `traceparent` header.
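For illustration, such a strategy could be as simple as a supplier of an optional header value. In the sketch below the `Supplier<Optional<String>>` shape and the class name are my own assumptions, not the SDK’s API; the OpenTelemetry calls build the W3C `traceparent` value from the current span:

```java
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanContext;

import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical strategy: the Dapr client would ask this supplier for a
// traceparent value right before calling the sidecar; an empty Optional
// means "no active trace, send nothing".
public class OpenTelemetryTraceparentSupplier implements Supplier<Optional<String>> {

  @Override
  public Optional<String> get() {
    SpanContext spanContext = Span.current().getSpanContext();
    if (!spanContext.isValid()) {
      return Optional.empty();
    }
    // W3C trace context format: version-traceId-spanId-traceFlags
    String traceparent = String.format("00-%s-%s-%s",
        spanContext.getTraceId(),
        spanContext.getSpanId(),
        spanContext.getTraceFlags().asHex());
    return Optional.of(traceparent);
  }
}
```

The Dapr client classes would then call `get()` just before writing headers for a sidecar call and, if a value is present, attach it as the `traceparent` header.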
Top GitHub Comments
Good news - I was able to get distributed tracing to work with Dapr & Spring Sleuth, too. Even better, I can make that fit in the same structure as the one that is used with OpenTelemetry: a `Function<reactor.util.context.Context, reactor.util.context.Context>`.
Since Sleuth auto-populates the Reactor `Context` with the Sleuth `TraceContext`, the “tracing context enricher” could be a singleton - its logic is completely implemented without instance variables. For OpenTracing, the “enricher” does need an instance variable (see this example), but I don’t think that’s that big of a problem.
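To make that shape concrete, here is a sketch of a singleton Sleuth-based enricher. It assumes Sleuth has stored its `TraceContext` in the Reactor `Context` under the `TraceContext.class` key, and the `"traceparent"` output key is purely illustrative:

```java
import org.springframework.cloud.sleuth.TraceContext;
import reactor.util.context.Context;

import java.util.Optional;
import java.util.function.Function;

// Sketch of a singleton "tracing context enricher" for Sleuth. It only reads
// what Sleuth already put into the Reactor Context, so it needs no state.
public final class SleuthContextEnricher implements Function<Context, Context> {

  public static final SleuthContextEnricher INSTANCE = new SleuthContextEnricher();

  private SleuthContextEnricher() {
  }

  @Override
  public Context apply(Context reactorContext) {
    // Assumption: Sleuth keys its entries by class, e.g. TraceContext.class.
    Optional<TraceContext> current = reactorContext.getOrEmpty(TraceContext.class);
    if (!current.isPresent()) {
      return reactorContext;
    }
    TraceContext trace = current.get();
    String flags = Boolean.TRUE.equals(trace.sampled()) ? "01" : "00";
    String traceparent = String.format("00-%s-%s-%s", trace.traceId(), trace.spanId(), flags);
    // "traceparent" is an illustrative key; the real key would be whatever the
    // Dapr client reads when it writes headers/metadata for the sidecar call.
    return reactorContext.put("traceparent", traceparent);
  }
}
```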
Aside
The Sleuth integration works when the app communicates with the sidecar over gRPC, but not when they use HTTP. Sleuth puts a few more entries in the Reactor `Context`, and some of their names contain a space, e.g. `"interface org.springframework.cloud.sleuth.Tracer"`. Even when you remove those entries in a Reactor `contextWrite(...)`, they pop back up when the Reactor `Context` is used to write HTTP headers - and HTTP header names do not allow spaces. I think the fact that you can’t remove something from the context is a bug, but I need to dive into that further.
So, to get back to the original proposal:
Describe the proposal
Provide the Dapr client with a “strategy” to enrich the Reactor `Context` with tracing information. The `DaprClientGrpc` and `DaprClientHttp` classes will invoke that strategy to enrich calls from the application to the sidecar with a tracing context.
The contract would at the very minimum look like this:
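For illustration, a minimal version of that contract could be little more than a named `Function` (the interface name is made up):

```java
import reactor.util.context.Context;

import java.util.function.Function;

// Minimal contract: the Dapr client hands the current Reactor Context to the
// enricher and uses whatever Context comes back for the call to the sidecar.
public interface TracingContextEnricher extends Function<Context, Context> {
}
```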
In order to push implementors towards providing useful values, we could make it a bit more expressive by providing a default implementation that delegates to values the implementor still has to supply:
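A sketch of that more expressive variant, where the method names and the `"traceparent"`/`"tracestate"` context keys are assumptions on my part:

```java
import reactor.util.context.Context;

import java.util.function.Function;

// More expressive variant: apply() has a default implementation that delegates
// to tracing-specific accessors the implementor must supply.
public interface TracingContextEnricher extends Function<Context, Context> {

  @Override
  default Context apply(Context reactorContext) {
    Context enriched = reactorContext;
    String traceparent = getTraceparent();
    if (traceparent != null) {
      enriched = enriched.put("traceparent", traceparent);
    }
    String tracestate = getTracestate();
    if (tracestate != null) {
      enriched = enriched.put("tracestate", tracestate);
    }
    return enriched;
  }

  // Implementors return the W3C trace context values, or null when there is
  // no active trace.
  String getTraceparent();

  String getTracestate();
}
```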
The default implementation could return an empty value. The Dapr SDK for Java could provide out-of-the-box implementations for OpenTracing and Sleuth, two popular tracing libraries. A user could of course also supply their own implementation based on the frameworks they are using.
Because those implementations depend on external libraries, we will introduce “optional” dependencies on both the OpenTracing and the Sleuth APIs.
I don’t think so. My expectation is that it’s more a matter of wiring the particular tracing library (Sleuth, OpenTracing, etc.) into the Dapr SDK.
Minimising dependencies absolutely makes sense to me, and I think unless there’s a lot of demand from the community, we shouldn’t be providing our own implementations at this point. We could deliver sample implementations, or, as @wmeints suggests, a blog post that explains how to do it.
My main doubt (still!) with the idea is whether it’s actually going to work. I simply don’t know for sure whether we will be able to guarantee that the Reactor `Context` we are passing to the interface impl. is going to be the correct one. I lack some in-depth knowledge about Reactor to be sure about it. The implementations will need to be 100% thread-safe, that’s for sure. Face it: if we’d pass the wrong Reactor `Context`, we’d be more wrong than the current situation, which simply doesn’t pass any `Context`.