How to track async spans?
So I think we’ll need to think about async or background tasks a little bit more. At the moment, if you have, let’s say, Service A calling Service B, and Service B uses a background task to send an email, you can do one of two things:
- the background task is in the same trace
- the background task is in a different trace
The second option is good UI-wise: you can search for all those traces, but you lose the connection to individual requests. The first option has the problem that if the background task is slow, it looks as if Service A was slow too. We have tried this, and even when selecting by name or using filters in the UI, traces are always shown in full and the total time is that of the slowest span.
Ideally, the solution would be to mark the spans that happen in the background as background or async spans, so that they are not counted when calculating how long the response took.
The same solution would work for AJAX requests, which you probably do not want to count towards the response time but still want connected to the request that originated them.
Technically, this could be a binary_annotation added to the span to mark it as async, and the server would need a number of changes.
Do you think that is a good idea at all, or should it be solved in some other way?
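To make the proposal above concrete, here is a minimal sketch of how a server could ignore flagged spans when computing the response time. The `Span` class, the `async` tag key, and `trace_duration_ms` are hypothetical names chosen for illustration; this is not Zipkin's schema or server code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    # Hypothetical minimal span model, not Zipkin's actual schema.
    name: str
    start_ms: int
    end_ms: int
    tags: Dict[str, str] = field(default_factory=dict)

    @property
    def is_async(self) -> bool:
        # "async" is an assumed tag key used to flag background work.
        return self.tags.get("async") == "true"


def trace_duration_ms(spans: List[Span], include_async: bool = True) -> int:
    """Wall-clock extent of the trace, optionally ignoring async spans."""
    counted = [s for s in spans if include_async or not s.is_async]
    if not counted:
        return 0
    return max(s.end_ms for s in counted) - min(s.start_ms for s in counted)


spans = [
    Span("service-a: GET /signup", 0, 120),
    Span("service-b: create user", 10, 110),
    # The email is sent in the background and finishes long after the response.
    Span("service-b: send email", 100, 2100, tags={"async": "true"}),
]

total_ms = trace_duration_ms(spans)                          # whole trace
response_ms = trace_duration_ms(spans, include_async=False)  # response only
```

With this kind of flag the background span stays attached to the request's trace, while the reported response time only covers the synchronous spans.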
Issue Analytics
- Created 8 years ago
- Reactions:1
- Comments:6 (3 by maintainers)
Top GitHub Comments
Agreed that a new child span in the parent trace is always a good idea. I also agree that it’s up to the application to decide whether the background work should be part of the same trace or not. For us it’s better as a separate trace, because the job can be queued and executed minutes after the main trace has finished, so merging them creates various unwanted side effects: the duration of the trace gets distorted, and the UI renders poorly because of the different time scales.
W.r.t. capturing the parent, one idea that was discussed in OpenTracing was that capturing multiple parents of a span (e.g. via something like span.add_parents(other_spans)) can work not only for joins within the same trace, but also for the above use case of linking multiple traces. The root span of the background job trace will have the trace/span ID of the “separate span” that you mentioned from the parent trace.
This seems to be more of a food-for-thought issue, raised at a time when async spans were not yet first-class in the Zipkin model or UI.
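The cross-trace linking idea from the first comment could look roughly like the sketch below. The `Span` class and the `originating_traces` helper are hypothetical, and `add_parents` only mirrors the name floated in that comment; it is not taken from a released tracing API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SpanRef = Tuple[str, str]  # (trace_id, span_id)


@dataclass
class Span:
    # Hypothetical model; add_parents mirrors the idea in the comment,
    # not an API from any released tracing library.
    trace_id: str
    span_id: str
    name: str
    parents: List[SpanRef] = field(default_factory=list)

    def add_parents(self, other_spans: List["Span"]) -> None:
        # Store references only, so parents may live in other traces.
        for other in other_spans:
            self.parents.append((other.trace_id, other.span_id))


def originating_traces(span: Span) -> List[str]:
    """Trace IDs of parents that live in a different trace than the span."""
    return [trace_id for (trace_id, _) in span.parents if trace_id != span.trace_id]


# Trace 1: the request, with a short span recording that a job was enqueued.
enqueue = Span("trace-1", "span-2", "enqueue send-email")

# Trace 2: the background job, possibly run minutes later. Its root span
# points back at the enqueue span instead of joining trace 1.
job_root = Span("trace-2", "span-1", "send-email job")
job_root.add_parents([enqueue])
```

The request trace keeps its own duration, and a UI could still follow the stored reference from the job back to the request that caused it.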
Nowadays, if you look at Spring Cloud Sleuth for example, an @Async service invocation will automatically create a span named async and nest everything happening in that invocation below it by default. In the UI, that work then shows up nested under the async span.
Of course, if you’d rather have the async operation as a separate trace, you can disable this behaviour and set a key as an annotation in both traces for correlation purposes, if still needed.
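The default nesting described above (an async span wrapping whatever runs in the background) can be illustrated with a small in-memory tracer. This is a sketch in Python, not Sleuth's API or implementation; `start_span`, `submit_async`, and `send_email` are hypothetical helpers, and the only real library calls are from the standard `contextvars` and `concurrent.futures` modules.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical in-memory tracer, only to illustrate the nesting.
_current = contextvars.ContextVar("current_span", default=None)


@dataclass
class Span:
    name: str
    parent: Optional["Span"] = None


recorded: List[Span] = []


@contextmanager
def start_span(name):
    # The new span's parent is whatever span is current in this context.
    span = Span(name, parent=_current.get())
    recorded.append(span)
    token = _current.set(span)
    try:
        yield span
    finally:
        _current.reset(token)


def submit_async(executor, fn, *args):
    # Copy the caller's context at submit time so the worker thread
    # still sees the request span as the current span.
    ctx = contextvars.copy_context()

    def run():
        # Wrap the background work in a span named "async".
        with start_span("async"):
            return fn(*args)

    return executor.submit(ctx.run, run)


def send_email(to):
    with start_span("send-email"):
        return f"sent to {to}"


with ThreadPoolExecutor(max_workers=1) as pool:
    with start_span("POST /signup") as request_span:
        future = submit_async(pool, send_email, "user@example.com")
    result = future.result()
```

Here the send-email span ends up under async, which in turn hangs off the request span, even though the work runs on another thread after the request span has closed.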
All this is just to illustrate that async spans can be tracked in the different ways described in this issue, and that it is up to the instrumentor to decide how. Closing this; if you feel that async spans should still be modeled more prominently or in a different way that improves on the current implementation, feel free to raise a separate issue with your suggestion!