Tracing information lost for RabbitTemplate.sendAndReceive timeouts
Describe the bug
We have a typical client-server setup, running on spring-cloud Hoxton.SR8:
- the server (S) handles requests via a @RabbitListener;
- the client (C) uses RabbitTemplate.sendAndReceive() to communicate with the server.
The problem is that if the server takes too long to respond, the client times out, and later on RabbitTemplate logs "Reply received after timeout for <correlationId>" warnings. Those warnings lack the tracing information, so they cannot be filtered by traceId, for example in Kibana.
Debugging the spring-sleuth code, I can see there’s instrumentation added for the (C) -> (S) leg of the request, which works fine, but no similar instrumentation is present for the return leg, i.e. (S) -> (C). Of course, in the normal (non-timed out) case, that wouldn’t be required, as all the MDC values are there for the duration of the current thread context. But in the case of a timeout, that context is gone, along with the MDC values.
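To make the timeout case concrete, here is a minimal plain-Java sketch of the mechanism. It is not the spring-amqp or Sleuth code: the class name, the TRACE_ID ThreadLocal (standing in for the MDC) and the timings are all assumptions chosen for illustration. The caller binds a trace id to its own thread, gives up waiting after 50 ms, and the reply is then handled on a different thread that never had the trace id bound.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LateReplyContextDemo {

    // Hypothetical stand-in for the logging MDC: a value bound to one thread only.
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /**
     * Returns { trace id seen by the caller, whether the caller timed out,
     * trace id seen by the thread that handles the late reply }.
     */
    public static String[] run() {
        // Models the thread on which the reply is eventually handled.
        ExecutorService replyThread = Executors.newSingleThreadExecutor();
        try {
            TRACE_ID.set("trace-1"); // the caller starts a traced request
            String seenByCaller = TRACE_ID.get();

            Future<String> lateReply = replyThread.submit(() -> {
                Thread.sleep(200); // the server is slow
                // This is where the "reply received after timeout" warning would be logged.
                return String.valueOf(TRACE_ID.get());
            });

            boolean timedOut = false;
            try {
                lateReply.get(50, TimeUnit.MILLISECONDS); // the caller's reply timeout
            } catch (TimeoutException e) {
                timedOut = true; // the caller gives up; its context ends here
            }
            TRACE_ID.remove();

            // Wait for the late reply only to observe what its thread could see.
            return new String[] { seenByCaller, String.valueOf(timedOut), lateReply.get() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            TRACE_ID.remove();
            replyThread.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("caller sees:     " + result[0]);
        System.out.println("caller timed out: " + result[1]);
        System.out.println("late reply sees: " + result[2]);
    }
}
```

In this model the late reply can only recover the trace id if it travels with the reply message itself, which is what instrumenting the return leg would have to provide.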
This could be fixed if similar instrumentation was added to that 2nd leg:
- (S) - AbstractRabbitListenerContainerFactory.setBeforeSendReplyPostProcessors() decorator;
- (C) - RabbitTemplate.afterReceivePostProcessors decorator.
@ionel-sirbu-crunch Thank you for the project (especially for the docker-compose file). I was able to reproduce your issue and indeed it is a bug. As I mentioned in https://github.com/spring-cloud/spring-cloud-sleuth/issues/1825#issuecomment-763241993, this issue is related: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660. Now that I have debugged your project, I can say it is not just related, it is the exact same issue (just the trigger is different: timeout vs. explicit exception).
Please see the details in https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660 (e.g. this comment: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-743582707). TL;DR: the tracing code in spring-amqp only wraps the invocation of the listener, not the invocation and the error handling together.
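The difference between the two wrapping strategies can be sketched in plain Java. This is only a model under stated assumptions, not the actual spring-amqp or Sleuth code: the class, the inScope helper and the TRACE_ID ThreadLocal (standing in for the tracing context) are hypothetical names.

```java
public class TracingScopeDemo {

    // Hypothetical stand-in for the tracing context bound to the current thread.
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Runs the body with the trace id bound, then unbinds it, even on failure. */
    static void inScope(String traceId, Runnable body) {
        TRACE_ID.set(traceId);
        try {
            body.run();
        } finally {
            TRACE_ID.remove();
        }
    }

    /** Scope wraps only the listener: error handling runs after the scope has closed. */
    public static String listenerOnly(Runnable listener) {
        String[] seenByErrorHandler = new String[1];
        try {
            inScope("trace-1", listener);
        } catch (RuntimeException e) {
            // Error handling happens here, outside the scope.
            seenByErrorHandler[0] = String.valueOf(TRACE_ID.get());
        }
        return seenByErrorHandler[0];
    }

    /** Scope wraps the listener and its error handling together. */
    public static String listenerAndErrorHandling(Runnable listener) {
        String[] seenByErrorHandler = new String[1];
        inScope("trace-1", () -> {
            try {
                listener.run();
            } catch (RuntimeException e) {
                // Error handling happens here, still inside the scope.
                seenByErrorHandler[0] = String.valueOf(TRACE_ID.get());
            }
        });
        return seenByErrorHandler[0];
    }

    public static void main(String[] args) {
        Runnable failingListener = () -> {
            throw new IllegalStateException("listener failed");
        };
        System.out.println("listener only:               " + listenerOnly(failingListener));
        System.out.println("listener and error handling: " + listenerAndErrorHandling(failingListener));
    }
}
```

With a listener that throws, the first variant's error handler no longer sees the trace id, while the second one still does.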
I suggested a fix for this, but it is a breaking change; it can be considered for the 2.4 release of rabbit-amqp. Gary also suggested a workaround using a RabbitListenerErrorHandler; you can give it a try.
Since this seems like a dupe, I’m closing this issue. Please comment on https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660 if you have questions about the fix or the workaround.
Here you go: https://github.com/ionel-sirbu-crunch/zipkin-rabbit-test Please let me know if that works for you.