Healthcheck error after provision on Openshift 4.10


Hello guys,

I’ve provisioned an instance, but it never starts. The log output [2] is not very clear, but judging by this line [1], I suspect the following: whenever the cluster runs a health check against the pod to determine its readiness, the component checks the health of its dependencies through the Route. However, the Route will never be available until the health check returns 200.

[1] https://github.com/cryostatio/cryostat/blob/ed9ff7e2d13da4d6c1d51a3325098e4169845295/src/main/java/io/cryostat/net/web/http/generic/HealthGetHandler.java#L120

[2]

WARNING: Exception thrown
java.io.IOException: io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 5000ms has been exceeded while executing GET /api/health for server cryostat-sample-grafana-bookinfo.apps.cluster-dfkdw.dfkdw.sandbox1648.opentlc.com:443
at io.cryostat.net.web.http.generic.HealthGetHandler.lambda$checkUri$0(HealthGetHandler.java:156)
at io.vertx.ext.web.client.impl.HttpContext.handleFailure(HttpContext.java:309)
at io.vertx.ext.web.client.impl.HttpContext.execute(HttpContext.java:303)
at io.vertx.ext.web.client.impl.HttpContext.next(HttpContext.java:275)
at io.vertx.ext.web.client.impl.predicate.PredicateInterceptor.handle(PredicateInterceptor.java:70)
at io.vertx.ext.web.client.impl.predicate.PredicateInterceptor.handle(PredicateInterceptor.java:32)
at io.vertx.ext.web.client.impl.HttpContext.next(HttpContext.java:272)
at io.vertx.ext.web.client.impl.HttpContext.fire(HttpContext.java:282)
at io.vertx.ext.web.client.impl.HttpContext.fail(HttpContext.java:262)
at io.vertx.ext.web.client.impl.HttpContext.lambda$handleSendRequest$7(HttpContext.java:422)
at io.vertx.core.impl.FutureImpl.tryFail(FutureImpl.java:195)
at io.vertx.ext.web.client.impl.HttpContext.lambda$handleSendRequest$15(HttpContext.java:518)
at io.vertx.core.http.impl.HttpClientRequestBase.handleException(HttpClientRequestBase.java:133)
at io.vertx.core.http.impl.HttpClientRequestImpl.handleException(HttpClientRequestImpl.java:371)
at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleException(Http1xClientConnection.java:525)
at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.reset(Http1xClientConnection.java:377)
at io.vertx.core.http.impl.HttpClientRequestImpl.reset(HttpClientRequestImpl.java:294)
at io.vertx.core.http.impl.HttpClientRequestBase.handleTimeout(HttpClientRequestBase.java:195)
at io.vertx.core.http.impl.HttpClientRequestBase.lambda$setTimeout$0(HttpClientRequestBase.java:118)
at io.vertx.core.impl.VertxImpl$InternalTimerHandler.handle(VertxImpl.java:942)
at io.vertx.core.impl.VertxImpl$InternalTimerHandler.handle(VertxImpl.java:906)
at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)
at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:43)
at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:229)
at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:221)
at io.vertx.core.impl.VertxImpl$InternalTimerHandler.run(VertxImpl.java:932)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 5000ms has been exceeded while executing GET /api/health for server cryostat-sample-grafana-bookinfo.apps.cluster:443
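
The circular dependency suspected in the report can be illustrated with a toy model. This is only a sketch of the reporter's hypothesis, not Cryostat or OpenShift code: the class `ReadinessDeadlockSketch`, the `Route` stand-in, and the `becomesReady` helper are all hypothetical names, and the model assumes a Route forwards traffic only once the pod has passed a readiness probe.

```java
import java.util.function.BooleanSupplier;

public class ReadinessDeadlockSketch {

    /** Hypothetical stand-in for a Route: it forwards traffic only to a pod that is already ready. */
    public static final class Route {
        private boolean podReady = false;

        /** A request through the Route succeeds only if the pod is ready. */
        public boolean forward() {
            return podReady;
        }

        public void markReady() {
            podReady = true;
        }
    }

    /**
     * Runs up to maxProbes readiness probes. The pod is marked ready (and the
     * Route starts forwarding) as soon as one health check succeeds.
     */
    public static boolean becomesReady(Route route, BooleanSupplier healthCheck, int maxProbes) {
        for (int i = 0; i < maxProbes; i++) {
            if (healthCheck.getAsBoolean()) {
                route.markReady();
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Health check that depends on the Route: it can never succeed,
        // because the Route only forwards after the check has succeeded.
        Route viaRoute = new Route();
        System.out.println("via Route:    " + becomesReady(viaRoute, viaRoute::forward, 10));

        // Health check that stays inside the pod and does not need the Route.
        Route viaLoopback = new Route();
        System.out.println("via loopback: " + becomesReady(viaLoopback, () -> true, 10));
    }
}
```

In this model the first call never becomes ready no matter how many probes run, while the second becomes ready on the first probe. Whether the real deployment behaves this way is exactly what the report is asking.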

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

andrewazores commented, May 5, 2022 (1 reaction)

Will leave this open until 2.1 is out and @mgohashi can verify the fix works. Thanks!

ebaron commented, May 5, 2022 (1 reaction)

Hi @mgohashi, in Cryostat 2.0 the health check is indeed using the Route URL. With the upcoming 2.1 release, this will be done using a host alias to the loopback address. I’m not sure why the health check is failing using the Route in your case, but at least in 2.1 this should be simplified with the health check traffic not leaving the pod.

We expect 2.1 to be available within the next couple of weeks.
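
For illustration, a health probe that stays on the loopback address might look roughly like the sketch below. It is not the Cryostat 2.1 implementation. It uses the JDK's `java.net.http.HttpClient` rather than the Vert.x client seen in the stack trace, and the class and method names are hypothetical. The `/api/health` path and the 5000 ms timeout are taken from the log above. The sketch targets `127.0.0.1` directly instead of a hostname mapped to it by a host alias.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class LoopbackHealthCheckSketch {

    /** Returns true if a GET to the URI answers with HTTP 200 within the timeout. */
    public static boolean isHealthy(URI uri, Duration timeout) {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(uri).timeout(timeout).GET().build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() == 200;
        } catch (IOException e) {
            // Timeouts and refused connections are treated as "unhealthy".
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Starts a throwaway HTTP server on the loopback address and probes it. */
    public static boolean checkAgainstLocalServer() {
        HttpServer server;
        try {
            // Port 0 asks the OS for any free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/api/health", exchange -> {
            exchange.sendResponseHeaders(200, -1); // 200 with no response body
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URI uri = URI.create("http://127.0.0.1:" + port + "/api/health");
            return isHealthy(uri, Duration.ofMillis(5000));
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy: " + checkAgainstLocalServer());
    }
}
```

Because the request never leaves the pod, its outcome does not depend on the Route or on the pod's readiness state.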
