
Can we define a shared thread-local storage format for Spans?


While explicit context propagation is in many cases the ideal solution, it’s not always practical and often requires that people change their code significantly. A thread-local storage (TLS) of the “current span” is a widely used approach in various Java tracers. Rather than leaving this up to implementations, it would be useful, IMO, to define a common mechanism directly in OpenTracing and so increase interoperability between different instrumentations. Specifically, we’d need to define at least two APIs:

(1) saving and retrieving the “current span”
(2) providing a wrapper Runnable / Callable / etc. for carrying the “current span” over thread boundaries

For (1), the two possible approaches are:

(1a) a single-span TLS
(1b) a stack-like TLS
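
To make the difference between (1a) and (1b) concrete, here is a minimal sketch of both storage shapes. Everything in it is an assumption for illustration: DemoSpan, SingleSpanTls and StackSpanTls are hypothetical names, not part of OpenTracing.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a tracer's span type (not the OpenTracing interface).
final class DemoSpan {
    final String name;
    DemoSpan(String name) { this.name = name; }
}

// (1a) Single-span TLS: one slot per thread. The caller has to keep the
// previous span somewhere if it wants to restore it later.
final class SingleSpanTls {
    private static final ThreadLocal<DemoSpan> CURRENT = new ThreadLocal<>();

    static DemoSpan get() { return CURRENT.get(); }

    // Installs the span and returns whatever was in the slot before.
    static DemoSpan set(DemoSpan span) {
        DemoSpan previous = CURRENT.get();
        if (span == null) {
            CURRENT.remove();
        } else {
            CURRENT.set(span);
        }
        return previous;
    }
}

// (1b) Stack-like TLS: the storage itself remembers the enclosing spans,
// at the cost of requiring balanced push/pop calls.
final class StackSpanTls {
    private static final ThreadLocal<Deque<DemoSpan>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    static DemoSpan current() { return STACK.get().peek(); } // null when empty
    static void push(DemoSpan span) { STACK.get().push(span); }
    static DemoSpan pop() { return STACK.get().pop(); }      // throws when empty
}
```

With the single slot, restoring the parent is the caller's job (keep the value returned by set and pass it back in). With the stack, pop restores it implicitly.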

Some prior discussions on Gitter:

yurishkuro Mar 24 12:33 @bensigelman @dkuebric @adriancole for in-process context propagation in Java, what are your thoughts on what thread-local storage should contain: a single current Span, or a stack of spans?

bensigelman Mar 26 00:02 @yurishkuro sorry for the delay re ^^^, just noticed this! this is a “rich” (i.e., complex) topic, but the short answer IMO is “a single current Span”.

bensigelman Mar 26 00:32 @yurishkuro @dkuebric @michaelsembwever perhaps more interesting (IMO) is what helpers exist to set and clear that span… One idea is something like the following:

public abstract class TracedRunnable implements Runnable {
    private SpanBuilder childBuilder;
    public TracedRunnable(SpanBuilder childBuilder) {
        this.childBuilder = childBuilder;
    }

    // For subclasses to override… akin to java.lang.Runnable.run().
    public abstract void runInSpan(Span sp);

    // The compatibility bridge to java.lang.Runnable
    public final void run() {
        Span parentSpan = Pseudocode.getSpanFromTLS();
        Span childSpan = childBuilder.setParent(parentSpan).start();
        // Make the child the current span while the body runs.
        Pseudocode.setSpanInTLS(childSpan);
        runInSpan(childSpan);
        // We need to reinstate parentSpan before returning from run().
        Pseudocode.setSpanInTLS(parentSpan);
    }
}

Calling code would look approximately like this:

SpanBuilder b = tracer.buildSpan("child_operation");
somethingThatNeedsARunnableClosure(arg1, arg2, new TracedRunnable(b) {
    public void runInSpan(Span sp) {
        // … do something, etc …
    }
    }
});

PS: @jmacd thoughts/recollections about the above? It’s not precisely what I remember from Java dapper, though maybe that’s not a bad thing. 😃
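
For API (2) from the issue description, one possible shape is a wrapper that captures the current span on the submitting thread and installs it on the thread that eventually runs the task. This is only a sketch under assumptions: CarriedSpan, CurrentSpan and SpanCarryingRunnable are hypothetical names, and the single-slot storage is just one of the two options under discussion.

```java
// Hypothetical stand-in for a tracer's span type.
final class CarriedSpan {
    final String name;
    CarriedSpan(String name) { this.name = name; }
}

// Hypothetical single-slot thread-local holder for the "current span".
final class CurrentSpan {
    private static final ThreadLocal<CarriedSpan> SLOT = new ThreadLocal<>();

    static CarriedSpan get() { return SLOT.get(); }

    static void set(CarriedSpan span) {
        if (span == null) {
            SLOT.remove();
        } else {
            SLOT.set(span);
        }
    }
}

// Carries the submitting thread's current span over to the executing thread.
final class SpanCarryingRunnable implements Runnable {
    private final Runnable delegate;
    private final CarriedSpan captured;

    SpanCarryingRunnable(Runnable delegate) {
        this.delegate = delegate;
        // Runs on the submitting thread, so this reads that thread's slot.
        this.captured = CurrentSpan.get();
    }

    @Override
    public void run() {
        // Runs on the executing thread: install the captured span, then put
        // back whatever that thread had before, even if the task throws.
        CarriedSpan previous = CurrentSpan.get();
        CurrentSpan.set(captured);
        try {
            delegate.run();
        } finally {
            CurrentSpan.set(previous);
        }
    }
}
```

A caller would wrap the task at submission time, for example executor.submit(new SpanCarryingRunnable(task)). A Callable variant would follow the same capture/install/restore pattern.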

yurishkuro Mar 26 18:08 @bensigelman the last line, Pseudocode.setSpanInTLS(parentSpan);, is the reason I asked. It works in this simple example (if you add try/finally), but isn’t that great in things like Jersey filters, because there are two independent filters, one for the request and one for the response. To be able to reference parentSpan, it needs to be stored somewhere else, since the TLS slot is occupied by the child span. If the TLS slot were instead a stack, then the request filter would do TLS.push(childSpan), and the response filter would do child = TLS.pop(); child.end(), thus automatically restoring the parent. What I don’t like about the stack is the possibility of scoping errors, leading to mismatched push/pop.
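
The two-filter scenario described above can be sketched as follows. RequestSide and ResponseSide are hypothetical stand-ins for a request filter and a response filter, not the Jersey API. The sketch also assumes both filters run on the same thread, which a real container does not always guarantee.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a tracer's span type.
final class FilterSpan {
    final String name;
    boolean ended;
    FilterSpan(String name) { this.name = name; }
    void end() { ended = true; }
}

// Hypothetical stack-like TLS, as in option (1b).
final class SpanStack {
    private static final ThreadLocal<Deque<FilterSpan>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    static void push(FilterSpan span) { STACK.get().push(span); }
    static FilterSpan pop() { return STACK.get().pop(); }
    static FilterSpan peek() { return STACK.get().peek(); }
}

// The request side starts a span and pushes it. It keeps no reference to it.
final class RequestSide {
    static void filter(String operation) {
        SpanStack.push(new FilterSpan(operation));
    }
}

// The response side shares no fields with the request side. It finds the
// span on the stack, ends it, and thereby uncovers the parent again.
final class ResponseSide {
    static FilterSpan filter() {
        FilterSpan child = SpanStack.pop();
        child.end();
        return child;
    }
}
```

The scoping risk mentioned above shows up here directly: if the response side never runs, or runs twice, every later peek on that thread returns the wrong span or pop throws.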

bensigelman Mar 26 22:55 @yurishkuro in the case of something like Jersey, I suppose the Span could be attached to the ContainerRequestContext. But that isn’t satisfying at all… doesn’t seem general enough. Having thought this through more carefully, I would argue (albeit with hesitation) for something like a stack… every Span would have an optional parent Span, and at finish-time the parent Span would be reinstated into TLS. In the example you gave above, you’d get the behavior you want with TLS as long as a Span is created at Jersey request filter time, and the same span is finished at Jersey response filter time. (It can of course create children of its own, etc.) If the above doesn’t make any sense, maybe there’s some sort of gist or something we could use as a place to make the problem more concrete?
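
The "optional parent, reinstated at finish-time" idea from the last message might look roughly like this. It is a sketch under assumptions, not the design that was eventually adopted: LinkedSpan is a hypothetical type, and the sketch assumes spans finish in the reverse order they were started, on the thread that started them.

```java
// Hypothetical span type that remembers its parent, so the thread-local
// needs only a single slot while still behaving like a stack.
final class LinkedSpan {
    private static final ThreadLocal<LinkedSpan> CURRENT = new ThreadLocal<>();

    final String name;
    final LinkedSpan parent; // null for a root span

    private LinkedSpan(String name, LinkedSpan parent) {
        this.name = name;
        this.parent = parent;
    }

    // Starts a span whose parent is whatever is current on this thread,
    // and makes the new span current.
    static LinkedSpan start(String name) {
        LinkedSpan span = new LinkedSpan(name, CURRENT.get());
        CURRENT.set(span);
        return span;
    }

    static LinkedSpan current() { return CURRENT.get(); }

    // Finishing reinstates the parent into the thread-local slot.
    void finish() {
        if (parent == null) {
            CURRENT.remove();
        } else {
            CURRENT.set(parent);
        }
    }
}
```

Because the parent link lives in the span itself, a response filter only needs LinkedSpan.current().finish() to restore the enclosing span. No reference has to be passed over from the request filter.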


Top GitHub Comments


toaler commented, Mar 30, 2016

We ended up creating a Span factory that provides static methods to create/destroy spans that set/remove the span from thread local. The factory also allows us to get the span in context on the corresponding thread (similar to TraceContext.getCurrentSpan()). Have not considered cross thread communication as of yet.


bhs commented, May 24, 2017

(Closed by #115)