Azure Monitor exporter timing out and throwing errors

Problem

I am using the OpenCensus Azure extension to write trace information to App Insights. However, of the two to three dozen requests I have made to my server, only one shows up in App Insights. I keep getting timeout errors on my end and 500 errors from IIS on the other end. I assume I’m doing something wrong, but I can’t figure out what it is.

Environment

I am on macOS. Python version:

3.8.0 (default, Jan  8 2020, 13:35:00)
[Clang 10.0.1 (clang-1001.0.46.4)]

Package Versions:

opencensus==0.7.7
opencensus-context==0.1.1
opencensus-ext-azure==1.0.2
Flask==1.1.1
Flask-Cors==3.0.8

To Reproduce

Server code:

import os
import time
import random

from flask import Flask, jsonify, request
from flask_cors import CORS
from opencensus.ext.azure.trace_exporter import AzureExporter
from opencensus.trace.samplers import ProbabilitySampler
from opencensus.trace.tracer import Tracer

azure_exporter = AzureExporter(connection_string='InstrumentationKey=************')
tracer = Tracer(exporter=azure_exporter, sampler=ProbabilitySampler(1.0))

app = Flask(__name__)
CORS(app)

@app.route('/')
def handle_request():
    with tracer.span(name="handler.respond"):
        to_sleep = random.random() * 2
        time.sleep(to_sleep)
    return "done"

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))

Client code:

$ curl localhost:8080

Expected Behavior

No timeouts or 500 errors and every request producing a new trace in App Insights (since I am using a probability sampler with probability=1).

Actual Behavior

I am seeing this error every few seconds:

Transient client side error HTTPSConnectionPool(host='dc.services.visualstudio.com', port=443): Read timed out. (read timeout=10.0).

and when I send a request I get this HTML back:

[Screenshot, 2020-02-21: the HTML response returned for the request; image not preserved]

Maybe this is just an App Insights server outage, but I’m guessing that I’m doing something wrong. Let me know if other information would be helpful.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 1
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

jonasmiederer commented, Jun 4, 2020 (3 reactions)

I am currently facing the same problem. Did you find a solution, @MaxTaggart?

lzchen commented, Jun 11, 2020 (1 reaction)

@SanthoshMedide Yes, this is not an SDK issue; it is an ingestion endpoint delay. See this comment. When this message appears, your telemetry is deemed “failed retryable”, and the exporter should attempt to send it again once the ingestion service is no longer backed up. You should eventually be able to see your telemetry in App Insights.
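
To make the “failed retryable” idea concrete, here is a minimal, stdlib-only sketch of the general pattern: a batch that hits a transient error stays queued and is retried on a later flush. This is not the SDK's actual implementation. The names RetryingExporter, TransientError, and make_flaky_send are hypothetical and exist only for this illustration.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class TransientError(Exception):
    """Hypothetical stand-in for a read timeout from the ingestion endpoint."""


class RetryingExporter:
    """Illustrative exporter that keeps unsent batches and retries them later."""

    def __init__(self, send: Callable[[List[str]], None]) -> None:
        self._send = send
        self._pending: Deque[List[str]] = deque()

    def export(self, batch: List[str]) -> int:
        """Queue a batch, then try to flush everything that is pending."""
        self._pending.append(batch)
        return self.flush()

    def flush(self) -> int:
        """Send pending batches oldest first; stop at the first transient failure.

        Returns the number of batches delivered in this attempt.
        """
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except TransientError:
                # "Failed retryable": keep the batch queued for the next attempt.
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)


def make_flaky_send(failures: int) -> Tuple[Callable[[List[str]], None], List[List[str]]]:
    """Build a fake send function that times out `failures` times, then succeeds."""
    state = {"remaining": failures}
    delivered: List[List[str]] = []

    def send(batch: List[str]) -> None:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TransientError("read timed out")
        delivered.append(batch)

    return send, delivered
```

With a fake endpoint that times out twice, the first two export calls deliver nothing, and a later flush delivers both queued batches in order. That matches the behavior described above: telemetry is delayed, not lost.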
