
Distributed Tracing with APM Server in Python

See original GitHub issue

I’ve been doing POC tests of different tracing technologies (Jaeger, Zipkin, Stackdriver Trace, and Istio, which still uses Jaeger or Zipkin, though with different concepts), and now I’m evaluating the Elasticsearch APM module. It follows more or less the same concept: you start a trace when a request begins and end it when you get a response.

I’ve generated some traces and can see them in Kibana, but each trace appears separately. That makes sense, since each service initializes a new Tracer object, which gets a new trace ID.

Now, to get a cascade view of spans across several services (or of several spans within the same service), I want to pass the trace ID to the next service, so that it initializes its tracer with that ID, generates a new span ID, and attaches its spans to the same trace.
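The propagation described above can be sketched in plain Python using the W3C `traceparent` header layout (`version-traceid-spanid-flags`), which is also the format Elastic APM agents exchange. The helper names here are made up for illustration, not part of any agent API:

```python
# Toy illustration of trace-ID propagation between services.
# A "traceparent" header carries the shared trace ID plus the caller's
# span ID, so the next service can join the same trace.
import secrets

def make_traceparent(trace_id=None):
    """Build a traceparent header; reuse trace_id if the caller passed one."""
    trace_id = trace_id or secrets.token_hex(16)   # 128-bit trace ID
    span_id = secrets.token_hex(8)                 # 64-bit parent span ID
    return f"00-{trace_id}-{span_id}-01"

def parse_traceparent(header):
    """Split a traceparent header into its trace ID and parent span ID."""
    version, trace_id, span_id, flags = header.split("-")
    return trace_id, span_id

# Service 1 starts a trace; service 2 extracts the trace ID from the
# incoming header and continues the same trace with a new span ID.
outgoing = make_traceparent()
trace_id, parent_span = parse_traceparent(outgoing)
next_hop = make_traceparent(trace_id)
assert parse_traceparent(next_hop)[0] == trace_id  # same trace, new span
```

This is exactly the bookkeeping the supported framework integrations hide from you.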

I’ve been reading the Python docs, and the only method that seems to suit this is elasticapm.set_context(), but all I found in the docs is this:

```python
def set_context(data, key="custom"):
    """
    Attach contextual data to the current transaction and errors
    that happen during the current transaction.

    If the transaction is not sampled, this function becomes a no-op.

    :param data: a dictionary, or a callable that returns a dictionary
    :param key: the namespace for this data
    """
    ...
```

I would like to know whether this is the right way to do this, or whether I’m completely off track.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

1 reaction
basepi commented, Feb 5, 2020

Alright, I see the disconnect. I have a working example, modified from your example above, in this gist.

The problem was that you didn’t quite have the instrumentation for services 2a/2b and 3 correct. When you looked at the trace, all you were seeing was the transaction and two spans (for the two network calls) from service 1.

This is because our instrumentation for Flask requires us to connect to Flask’s signals, which we only do if you set up our Flask integration, as documented here.
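For reference, the Flask integration setup follows Elastic’s documented pattern; this is a minimal configuration fragment, with the service name as a placeholder for your own:

```python
from flask import Flask
from elasticapm.contrib.flask import ElasticAPM

app = Flask(__name__)
# "service-2a" is a placeholder; use your own service name here.
app.config["ELASTIC_APM"] = {"SERVICE_NAME": "service-2a"}
apm = ElasticAPM(app)
```

With this in place, the agent hooks Flask’s request signals and picks up the incoming trace headers automatically.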

Otherwise the Flask routing doesn’t get instrumented, which means that although the headers from service 1 are present, the agent doesn’t know to look for them. To create a transaction that actually uses the incoming HTTP headers, you have to call begin_transaction with a TraceParent object, as we do here (in the Flask integration code). So, if you ever need to do distributed tracing in an unsupported framework, you’d do something like that.
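For the unsupported-framework case, a rough sketch of that manual wiring might look like the following. Treat it as an assumption-laden outline rather than the agent’s canonical API: the module path for TraceParent and the propagation header name have varied between agent versions, so check the version you run:

```python
import elasticapm
from elasticapm.utils.disttracing import TraceParent  # path may vary by version

# Hypothetical service name; configure for your own service.
client = elasticapm.Client(service_name="service-3")

def handle_request(headers, do_work):
    # Older agents propagate "elastic-apm-traceparent"; newer ones also
    # send the W3C "traceparent" header, so check both.
    header = headers.get("elastic-apm-traceparent") or headers.get("traceparent")
    trace_parent = TraceParent.from_string(header) if header else None

    # Joining the incoming trace (instead of starting a fresh one) is what
    # makes the spans appear in the same waterfall in Kibana.
    client.begin_transaction("request", trace_parent=trace_parent)
    try:
        result = do_work()
        client.end_transaction("handle_request", "success")
        return result
    except Exception:
        client.end_transaction("handle_request", "failure")
        raise
```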

Luckily, if you use our official integrations, we do all that hard work for you!

This is the waterfall I see when I run the example in my gist:

[Screenshot: Kibana waterfall view of the resulting distributed trace]

Much better!

Please keep me posted if anything I explained wasn’t clear. We’re here to help!

0 reactions
beniwohli commented, Apr 22, 2020

It looks like all questions have been addressed. I’ll close this for now 😃

Read more comments on GitHub >

Top Results From Across the Web

Distributed tracing for your Python services
Standard distributed tracing for APM agents (above) captures up to 10% of your traces, but if you want us to analyze all your...
Distributed tracing | APM User Guide [8.5] - Elastic
Distributed tracing enables you to analyze performance throughout your microservice architecture by tracing the entirety of a request — from the initial web ......
Tracing Python Applications - Datadog Docs
Configure the Datadog Agent for APM. Install and configure the Datadog Agent to receive traces from your now instrumented application. By default the...
The gentle touch of APM - how code tracing works in Python
It has been a busy several years in monitoring and observability. As we've hit limits on the visibility and detail that logging and...
Application Performance Monitoring: Instrumenting Python ...
Oracle APM: Python code instrumentation example. ... APM supports the Zipkin distributed tracing system, so our Python code is instrumented using Zipkin.
