Healthcheck error after provision on Openshift 4.10
Hello guys,
I’ve provisioned an instance, but it never starts. The log details [2] are not very clear, but I have the feeling that, according to this line [1], whenever the cluster performs a health check on the pod to determine its readiness, the component tries to check the health of its dependencies using the Route. But the Route will never be available until the health check returns 200.
[2]
WARNING: Exception thrown
java.io.IOException: io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 5000ms has been exceeded while executing GET /api/health for server cryostat-sample-grafana-bookinfo.apps.cluster-dfkdw.dfkdw.sandbox1648.opentlc.com:443
at io.cryostat.net.web.http.generic.HealthGetHandler.lambda$checkUri$0(HealthGetHandler.java:156)
at io.vertx.ext.web.client.impl.HttpContext.handleFailure(HttpContext.java:309)
at io.vertx.ext.web.client.impl.HttpContext.execute(HttpContext.java:303)
at io.vertx.ext.web.client.impl.HttpContext.next(HttpContext.java:275)
at io.vertx.ext.web.client.impl.predicate.PredicateInterceptor.handle(PredicateInterceptor.java:70)
at io.vertx.ext.web.client.impl.predicate.PredicateInterceptor.handle(PredicateInterceptor.java:32)
at io.vertx.ext.web.client.impl.HttpContext.next(HttpContext.java:272)
at io.vertx.ext.web.client.impl.HttpContext.fire(HttpContext.java:282)
at io.vertx.ext.web.client.impl.HttpContext.fail(HttpContext.java:262)
at io.vertx.ext.web.client.impl.HttpContext.lambda$handleSendRequest$7(HttpContext.java:422)
at io.vertx.core.impl.FutureImpl.tryFail(FutureImpl.java:195)
at io.vertx.ext.web.client.impl.HttpContext.lambda$handleSendRequest$15(HttpContext.java:518)
at io.vertx.core.http.impl.HttpClientRequestBase.handleException(HttpClientRequestBase.java:133)
at io.vertx.core.http.impl.HttpClientRequestImpl.handleException(HttpClientRequestImpl.java:371)
at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleException(Http1xClientConnection.java:525)
at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.reset(Http1xClientConnection.java:377)
at io.vertx.core.http.impl.HttpClientRequestImpl.reset(HttpClientRequestImpl.java:294)
at io.vertx.core.http.impl.HttpClientRequestBase.handleTimeout(HttpClientRequestBase.java:195)
at io.vertx.core.http.impl.HttpClientRequestBase.lambda$setTimeout$0(HttpClientRequestBase.java:118)
at io.vertx.core.impl.VertxImpl$InternalTimerHandler.handle(VertxImpl.java:942)
at io.vertx.core.impl.VertxImpl$InternalTimerHandler.handle(VertxImpl.java:906)
at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)
at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:43)
at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:229)
at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:221)
at io.vertx.core.impl.VertxImpl$InternalTimerHandler.run(VertxImpl.java:932)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 5000ms has been exceeded while executing GET /api/health for server cryostat-sample-grafana-bookinfo.apps.cluster:443
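To illustrate the chicken-and-egg problem described above: if the container’s readiness probe calls `/api/health`, and that handler in turn dials the external Route, the pod can never become Ready, because OpenShift only admits Ready pods as Route endpoints. A minimal sketch of such a probe (the port, path, and timings are illustrative assumptions, not taken from the actual Cryostat manifests):

```yaml
# Hypothetical container readiness probe; values are illustrative.
readinessProbe:
  httpGet:
    path: /api/health   # handler that in turn checks dependencies via the Route
    port: 8181
    scheme: HTTPS
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3
```

With a probe like this, the Route backing the dependency check has no Ready endpoints until the probe passes, and the probe cannot pass until the Route responds, so the 5000ms timeout in the log above would recur indefinitely.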
Issue Analytics
- Created a year ago
- Comments:5 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Will leave this open until 2.1 is out and @mgohashi can verify the fix works. Thanks!
Hi @mgohashi, in Cryostat 2.0 the health check is indeed using the Route URL. With the upcoming 2.1 release, this will be done using a host alias to the loopback address. I’m not sure why the health check is failing using the Route in your case, but at least in 2.1 this should be simplified with the health check traffic not leaving the pod.
We expect 2.1 to be available within the next couple of weeks.
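The host-alias approach mentioned in the comment above could look something like the following pod spec fragment (the hostname is an illustrative assumption; the actual alias used by the operator in 2.1 may differ). Mapping the dependency’s hostname to the loopback address means the health-check traffic never leaves the pod, so it no longer depends on the Route being admitted:

```yaml
# Sketch of a pod-level host alias; hostname is hypothetical.
spec:
  hostAliases:
  - ip: "127.0.0.1"
    hostnames:
    - "cryostat-sample-grafana"   # resolves to loopback inside the pod
```

Entries in `hostAliases` are written into the pod’s `/etc/hosts`, so any in-pod resolution of that hostname short-circuits to localhost regardless of cluster DNS or Route state.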