Tracing information lost for RabbitTemplate.sendAndReceive timeouts

Describe the bug

We have a typical client-server setup running on spring-cloud Hoxton.SR8:

  • the server (S) handles requests via a @RabbitListener;
  • the client (C) uses RabbitTemplate.sendAndReceive() to communicate with the server.

The problem is that if the server takes too long to respond, the client times out, and later on RabbitTemplate logs "Reply received after timeout for <correlationId>" warnings. Those warnings lack the tracing information, so they cannot be filtered by traceId in Kibana, for example.

Debugging the spring-sleuth code, I can see there’s instrumentation added for the (C) -> (S) leg of the request, which works fine, but no similar instrumentation is present for the return leg, i.e. (S) -> (C). Of course, in the normal (non-timed-out) case that wouldn’t be required, as all the MDC values are available for the duration of the current thread context. But in the case of a timeout, that context is gone, along with the MDC values.
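To illustrate why the late-reply warning has no tracing information, here is a minimal plain-Java sketch. It does not use Spring AMQP or Sleuth: the `TRACE_ID` thread-local is a hypothetical stand-in for the logging MDC, and the single-thread executor stands in for whichever thread ends up handling the late reply. The class and method names are assumptions made for this example only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LostContextDemo {
    // Hypothetical stand-in for the MDC: the value is bound to one thread only.
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Reads the trace id from a different thread, the way a reply-handling
    // thread would when a reply arrives after the caller has given up.
    static String traceIdOnListenerThread() {
        ExecutorService listener = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture.supplyAsync(TRACE_ID::get, listener).join();
        } finally {
            listener.shutdown();
        }
    }

    public static void main(String[] args) {
        TRACE_ID.set("abc123"); // set on the calling thread
        System.out.println("caller thread sees:   " + TRACE_ID.get());
        System.out.println("listener thread sees: " + traceIdOnListenerThread());
        TRACE_ID.remove();
    }
}
```

The calling thread still sees its trace id, while the other thread sees nothing, which is the same situation a log statement is in when it runs outside the thread that started the request.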

This could be fixed if similar instrumentation was added to that 2nd leg:

  • (S) - AbstractRabbitListenerContainerFactory.setBeforeSendReplyPostProcessors() decorator;
  • (C) - RabbitTemplate.afterReceivePostProcessors decorator.
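The idea behind the two decorators above can be sketched in plain Java, without Spring AMQP or Sleuth. Everything here is an assumption for illustration: headers are modelled as a `Map`, `X-Trace-Id` is a made-up header name, `TRACE_ID` stands in for the MDC, and `beforeSendReply` / `logLateReply` are hypothetical helpers, not library methods. The sketch only shows the propagation pattern: the server copies the trace id into the reply headers, and the client restores it before logging.

```java
import java.util.HashMap;
import java.util.Map;

public class ReplyLegPropagationSketch {
    // Hypothetical stand-in for the MDC.
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();
    // Made-up header name, used only in this sketch.
    static final String TRACE_HEADER = "X-Trace-Id";

    // Server side: copy the current trace id into the reply headers before sending.
    static Map<String, String> beforeSendReply(Map<String, String> headers) {
        Map<String, String> out = new HashMap<>(headers);
        String id = TRACE_ID.get();
        if (id != null) {
            out.put(TRACE_HEADER, id);
        }
        return out;
    }

    // Client side: restore the trace id from the reply headers, build the log
    // line, then put back whatever context the thread had before.
    static String logLateReply(Map<String, String> headers, String correlationId) {
        String previous = TRACE_ID.get();
        TRACE_ID.set(headers.get(TRACE_HEADER));
        try {
            return "[traceId=" + TRACE_ID.get() + "] Reply received after timeout for " + correlationId;
        } finally {
            if (previous == null) {
                TRACE_ID.remove();
            } else {
                TRACE_ID.set(previous);
            }
        }
    }

    public static void main(String[] args) {
        TRACE_ID.set("abc123"); // server handles the request in this context
        Map<String, String> replyHeaders = beforeSendReply(new HashMap<>());
        TRACE_ID.remove();      // the original context is gone by the time the reply arrives
        System.out.println(logLateReply(replyHeaders, "corr-1"));
    }
}
```

Whether the real post-processor hooks run before the late-reply warning is logged is not confirmed by this thread, so treat the sketch as a description of the pattern rather than a working fix.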

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

jonatan-ivanov commented, Feb 1, 2021 (1 reaction)

@ionel-sirbu-crunch Thank you for the project (especially for the docker-compose file). I was able to reproduce your issue, and it is indeed a bug. As I mentioned in https://github.com/spring-cloud/spring-cloud-sleuth/issues/1825#issuecomment-763241993, this issue is related: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660. Now that I have debugged your project, I can see it is not just related, it is the exact same issue (only the trigger is different: timeout vs. explicit exception).

Please see the details in https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660 (e.g. this comment: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-743582707). TL;DR: the tracing code in spring-amqp wraps only the invocation of the listener, not the invocation and the error handling together.

I suggested a fix for this, but it is a breaking change; it can be considered for the 2.4 release of rabbit-amqp. Gary also suggested a workaround using a RabbitListenerErrorHandler, which you can try.

Since this seems like a dupe, I’m closing this issue. Please comment on https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660 if you have questions about the fix or the workaround.

ionel-sirbu-crunch commented, Feb 1, 2021 (0 reactions)

Here you go: https://github.com/ionel-sirbu-crunch/zipkin-rabbit-test Please let me know if that works for you.
