Prometheus async worker thread crashes after upgrade


Steps to reproduce

I upgraded a Jenkins instance that had a previously working Prometheus metrics plugin, using the official Jenkins Docker image (Alpine). Jenkins core went from 2.176.2 to 2.176.3, and the Prometheus plugin went from 2.0.0 to 2.0.6.

On startup, and periodically afterwards, I see the following stack trace in my logs:


Sep 18, 2019 5:11:26 PM hudson.model.AsyncPeriodicWork$1 run
INFO: Started prometheus_async_worker
Sep 18, 2019 5:11:26 PM hudson.init.impl.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler uncaughtException
SEVERE: A thread (prometheus_async_worker thread/194) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
java.lang.StackOverflowError
        at java.util.TreeMap.put(TreeMap.java:568)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:44)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
        at org.jenkinsci.plugins.prometheus.util.FlowNodes.traverseTree(FlowNodes.java:49)
...

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 22 (4 by maintainers)

Top GitHub Comments

2 reactions
cpitstick-argo commented, Nov 6, 2019

My first guess is that the problem is the recursive algorithm used to traverse the FlowNodes. If the traversal is moved to an iterative algorithm (which is hard to derive from first principles, but fortunately we have Google), I expect one of two things will happen:

  1. The traversal runs out of stack space simply because companies with larger Jenkins deployments have more FlowNodes than there is stack memory, and moving it to iterative moves the memory to the heap instead of the call stack, which will completely resolve the issue.

  2. The plugin will go into a truly infinite recursion, at which point we can then figure out why this is happening. Given that FlowNodes are a Jenkins construct and not one of this plugin, I’d be surprised if the tree is constructed badly.
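The recursive-to-iterative change suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Node` and `countReachable` are hypothetical stand-ins, not the Jenkins `FlowNode` API or the plugin's actual `FlowNodes.traverseTree` code. The pending nodes live in a heap-allocated `ArrayDeque` instead of on the call stack, and a visited set is added so that a malformed (cyclic) graph terminates rather than looping forever.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class IterativeTraversal {

    /** Hypothetical stand-in for a pipeline node; NOT the Jenkins FlowNode class. */
    static final class Node {
        final String id;
        final List<Node> children = new ArrayList<>();

        Node(String id) {
            this.id = id;
        }
    }

    /**
     * Counts the nodes reachable from root without recursion.
     * The work list is an explicit stack on the heap, so the depth of the
     * graph is bounded by heap size rather than by thread stack size.
     */
    static int countReachable(Node root) {
        if (root == null) {
            return 0;
        }
        // Identity-based set: each node object is visited once, even if the graph has cycles.
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>();
        visited.add(root);
        pending.push(root);

        while (!pending.isEmpty()) {
            Node current = pending.pop();
            for (Node child : current.children) {
                // add() returns false if the child was already seen, so it is not queued again.
                if (visited.add(child)) {
                    pending.push(child);
                }
            }
        }
        return visited.size();
    }
}
```

With a chain of a few hundred thousand nodes, a naive recursive version would likely overflow a default-sized thread stack, whereas this version only grows the deque. Dropping the visited set would reproduce the second scenario: a cyclic graph would then keep the loop running indefinitely, which makes the underlying data problem visible.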

I concede I don’t know exactly what FlowNodes are or why this broke from 2.0.0 -> 2.0.6, but that’s my first guess.

1 reaction
kisoku commented, Oct 10, 2019

I just ran into the same issue on Jenkins 2.199 and Prometheus 2.0.6. I was forced to downgrade the plugin to 2.0.0, because I would wind up with one CPU pegged and Jenkins would no longer service requests.
