Production - [Alerting] DotNetEng Status Failed Requests/Hour alert

💔 Metric state changed to alerting

The number of failed DotNetEng Status requests per hour is above 20. This may indicate a systemic problem that needs to be investigated. To initially investigate prod, run the following query in DotNetEng-Status-Prod; to investigate staging, run it in DotNetEng-Status-Staging:

union exceptions, traces
| project timestamp, operation_Name, customDimensions, message, problemId, details
| order by timestamp asc
  • failuresCount 22
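
As a rough illustration of the threshold the alert describes, the sketch below buckets failure timestamps by clock hour and reports the hours with more than 20 failures. It is written in Python for illustration only. The function names, the sample data, and the use of fixed clock-hour buckets are all assumptions; the actual Grafana rule may evaluate a rolling window instead.

```python
from collections import Counter
from datetime import datetime, timedelta

# Threshold taken from the alert text: more than 20 failed requests per hour.
THRESHOLD = 20


def failures_per_hour(timestamps):
    """Count failures per clock hour (hypothetical helper, not part of the service)."""
    buckets = Counter()
    for ts in timestamps:
        # Truncate each timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour] += 1
    return buckets


def alerting_hours(timestamps, threshold=THRESHOLD):
    """Return the hours, in order, whose failure count exceeds the threshold."""
    counts = failures_per_hour(timestamps)
    return sorted(hour for hour, n in counts.items() if n > threshold)


# Hypothetical sample data: 22 failures inside a single hour,
# mirroring the failuresCount of 22 reported by the alert.
start = datetime(2023, 3, 28, 10, 0)
sample = [start + timedelta(minutes=i) for i in range(22)]
print(alerting_hours(sample))
```

In practice the timestamps would come from the rows returned by the query above, for example from an exported result set.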

@dotnet/dnceng, please investigate

Automation information below, do not change

Grafana-Automated-Alert-Id-d2dd705a6c724ed68fcf6955561c06dd

Issue Analytics

  • State: closed
  • Created 6 months ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
riarenas commented, Mar 28, 2023

I tested the scenario for the failing code path (editing a pull request title or body, and handling any mentions of the dnceng team), and things are now working as expected. Given that the alert cleared, I’m starting to think this was a problem with the data in the GitHub payload. We failed to process a lot of PRs while this was happening, and things cleared up by themselves. I’m going to close this and keep an eye out for another alert to see if it makes sense to harden against this somehow.
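
One possible shape for the hardening mentioned above is to treat every field of the incoming payload as optional. The sketch below is an assumption, not the service's actual code: the comment does not say what was wrong with the payload, the function name `mentions_team` is hypothetical, and Python is used only for illustration. It relies on the `pull_request.title` and `pull_request.body` fields of GitHub's pull request webhook payload, where `body` can be null.

```python
TEAM_MENTION = "@dotnet/dnceng"


def mentions_team(payload, team=TEAM_MENTION):
    """Return True if the PR title or body mentions the team.

    Hypothetical helper: missing keys, null values, and unexpected types
    are treated as "no mention" instead of raising an exception.
    """
    if not isinstance(payload, dict):
        return False
    pull_request = payload.get("pull_request")
    if not isinstance(pull_request, dict):
        return False
    for field in ("title", "body"):
        text = pull_request.get(field)
        # A PR body can be null in the webhook payload, so check the type first.
        if isinstance(text, str) and team.lower() in text.lower():
            return True
    return False
```

With this approach, a malformed event is skipped rather than counted as a failed request, at the cost of hiding payload problems unless they are logged separately.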

0 reactions
riarenas commented, Mar 28, 2023

@dotnet/dnceng This is a test, please disregard.
