E2E service tracing via SQS not working


I set up the AWS OTel agent (version 1.13.1-aws) on a Quarkus application that uses software.amazon.awssdk.services.sqs.SqsAsyncClient from AWS SDK 2.17.103 and runs in an ECS container. The trace segments are exported by the AWS OTel Collector and are visible in X-Ray like this:

serviceA --send-> SQS
serviceB --receive-> SQS

However, it does not seem possible to get the downstream service included in the same trace, like this: serviceA -send-> SQS -receive-> serviceB

This seems closely related to this old issue, which is still open but inactive: https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/3684. Even if it is exactly the same problem, it would be nice to get a perspective, since this is presumably one of the most frequent uses of tracing in microservice architectures.

I checked the messages in the SQS queue and noticed that the attributes are empty, whereas I expected auto-instrumentation to set an AWS trace header. So I now wonder:

  1. if this is a bug (https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/3684) or just not implemented
  2. if I am simply doing something wrong here
  3. if I have to take care of writing/reading the AWS trace header myself on sqs.send and sqs.receive (examples would be welcome, as I found only an example for AWS SDK 1, which is not compatible with AWS SDK 2: https://stackoverflow.com/questions/51954687/how-to-trace-a-request-through-an-sqs-queue-with-aws-x-ray)
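
As a rough illustration of what point 3 would involve, here is a minimal sketch of writing and reading the X-Ray tracing header by hand. It assumes the documented header format `Root=...;Parent=...;Sampled=...` and the SQS message system attribute name `AWSTraceHeader`. The class `XRayHeaderSketch` and its methods are hypothetical helpers, and a plain map stands in for the message's system attributes so the sketch runs without the AWS SDK.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: builds and parses the X-Ray tracing header value. */
class XRayHeaderSketch {

    // Name of the SQS message system attribute that carries the X-Ray header.
    static final String AWS_TRACE_HEADER = "AWSTraceHeader";

    /** Formats "Root=<trace id>;Parent=<segment id>;Sampled=0|1". */
    static String format(String traceId, String parentId, boolean sampled) {
        return "Root=" + traceId + ";Parent=" + parentId + ";Sampled=" + (sampled ? "1" : "0");
    }

    /** Splits a header value into its key/value fields; malformed pairs are skipped. */
    static Map<String, String> parse(String header) {
        Map<String, String> fields = new HashMap<>();
        for (String pair : header.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                fields.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Producer side: the map is a stand-in (assumption) for the system
        // attributes that would be attached to the outgoing SQS message.
        Map<String, String> systemAttributes = new HashMap<>();
        systemAttributes.put(AWS_TRACE_HEADER,
                format("1-62a33fbc-519ae399b605ac458103fb19", "29bfd1ac08145c6d", true));

        // Consumer side: recover the parent context before starting the receive span.
        Map<String, String> ctx = parse(systemAttributes.get(AWS_TRACE_HEADER));
        System.out.println(ctx.get("Root") + " / " + ctx.get("Parent"));
    }
}
```

In a real service the producer would attach the value to the outgoing message's system attributes, and the consumer would have to request that attribute explicitly when receiving. The exact builder calls depend on the SDK version and are deliberately left out here.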

Some further technical context:

AWS OTel agent config

# service A
ENV OTEL_RESOURCE_ATTRIBUTES="service.name=serviceA"

# service B
ENV OTEL_RESOURCE_ATTRIBUTES="service.name=serviceB"

# service A+B
ENV JAVA_OPTIONS="\
-Dquarkus.http.host=0.0.0.0 \
-Djava.util.logging.manager=org.jboss.logmanager.LogManager \
-Dotel.propagators=tracecontext,baggage,xray \
-Dotel.instrumentation.common.default-enabled=true \
-Dotel.instrumentation.opentelemetry-annotations.enabled=true \
-Dotel.traces.sampler=always_on"

AWS OTel Collector log

2022-06-10T12:57:40.665Z	debug	awsxrayexporter@v0.51.0/awsxray.go:66	request: {
  TraceSegmentDocuments: [
  
  [n subsegments]....
  
	    "{\"name\":\"Sqs\",\"id\":\"29bfd1ac08145c6d\",\"start_time\":1654865858.891311,\"origin\":\"AWS::ECS::Container\",\"trace_id\":\"1-62a33fbc-519ae399b605ac458103fb19\",\"end_time\":1654865860.5327685,\"http\":{\"request\":{\"method\":\"POST\",\"url\":\"https://sqs.eu-central-1.amazonaws.com?Action=SendMessage\\u0026Version=2012-11-05\\u0026QueueUrl=https%3A%2F%2Fsqs.eu-central-1.amazonaws.com%2F123456789123%2Fmyqueue.fifo\\u0026MessageBody=%7B%22fipsId%22%3A3295020%2C%22loggerTypeId%22%3A20%2C%22loggerCount%22%3A0%7D\\u0026MessageGroupId=defaultMessageGroup\",\"user_agent\":\"aws-sdk-java/2.17.103 Linux/4.14.276-211.499.amzn2.x86_64 OpenJDK_64-Bit_Server_VM/17.0.3+7-LTS Java/17.0.3 vendor/Red_Hat__Inc. exec-env/AWS_ECS_FARGATE io/async http/NettyNio cfg/retry-mode/legacy\"},\"response\":{\"status\":200,\"content_length\":0}},\"fault\":false,\"error\":false,\"throttle\":false,\"aws\":{\"ecs\":{\"container\":\"ip-10-3-118-186.eu-central-1.compute.internal\",\"container_id\":\"4c9eb21f0854800be118/c02a425851a64c9eb21f0854800be118-1682837531\"},\"xray\":{\"sdk\":\"opentelemetry for java\",\"sdk_version\":\"1.13.0\",\"auto_instrumentation\":true},\"operation\":\"SendMessage\",\"request_id\":\"bfc2f028-c367-512e-8f3a-ab3581635091\",\"queue_url\":\"https://sqs.eu-central-1.amazonaws.com/123456789123/myqueue.fifo\"},\"metadata\":{\"default\":{\"aws.agent\":\"java-aws-sdk\",\"http.flavor\":\"1.1\",\"net.transport\":\"ip_tcp\",\"rpc.service\":\"Sqs\",\"rpc.system\":\"aws-api\",\"thread.id\":86,\"thread.name\":\"executor-thread-9\"}},\"namespace\":\"aws\",\"parent_id\":\"a2653d250d74fb85\",\"type\":\"subsegment\"}\n",
		
   [n subsegments]...
  
]

trace_id and parent_id are set and are identical for all subsegments.
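
When comparing the `trace_id` in these segment documents with trace IDs shown by other OpenTelemetry backends, it can help to convert between the two notations. The sketch below assumes the usual mapping, in which the 32 hex characters of a W3C/OpenTelemetry trace ID are split into the 8-character epoch part and the 24-character unique part of the X-Ray ID. `TraceIdSketch` and its method names are hypothetical.

```java
/** Hypothetical helper: converts between X-Ray and W3C trace ID notation. */
class TraceIdSketch {

    /** "1-62a33fbc-519ae399b605ac458103fb19" becomes "62a33fbc519ae399b605ac458103fb19". */
    static String xrayToW3c(String xrayTraceId) {
        String[] parts = xrayTraceId.split("-");
        if (parts.length != 3 || parts[1].length() != 8 || parts[2].length() != 24) {
            throw new IllegalArgumentException("not an X-Ray trace ID: " + xrayTraceId);
        }
        // Drop the version prefix and the separators.
        return parts[1] + parts[2];
    }

    /** "62a33fbc519ae399b605ac458103fb19" becomes "1-62a33fbc-519ae399b605ac458103fb19". */
    static String w3cToXray(String w3cTraceId) {
        if (w3cTraceId.length() != 32) {
            throw new IllegalArgumentException("not a 32-character trace ID: " + w3cTraceId);
        }
        // Version 1, then 8 hex chars of epoch seconds, then 24 hex chars of unique ID.
        return "1-" + w3cTraceId.substring(0, 8) + "-" + w3cTraceId.substring(8);
    }

    public static void main(String[] args) {
        // Trace ID taken from the collector log above.
        System.out.println(xrayToW3c("1-62a33fbc-519ae399b605ac458103fb19"));
    }
}
```

If the sending and receiving services report different IDs after conversion, the context was not propagated through the queue.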

Please let me know if I should provide more technical details.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
laurit commented, Jun 22, 2022

@eric-spence-code thanks for stepping up, I already implemented it yesterday (well, actually copied what aws sdk1 instrumentation was doing).

0 reactions
syr commented, Nov 1, 2022

I re-tested this locally with two Quarkus apps (set up as in https://quarkus.io/guides/opentelemetry) including the Java agent, a LocalStack SQS queue, and a local Jaeger all-in-one instance as the tracing backend. However, sendMessage and receiveMessage are still not in the same trace. I will try to create a reproducer soon, but will likely debug the tests for this first.
