KubernetesClientException is swallowed in LeaderElector
Describe the bug
The current LeaderElector implementation swallows the following KubernetesClientException, which then prevents subsequent renew attempts from working properly until the renew deadline is reached. This is a serious problem when the Kubernetes cluster has multiple API servers and the one handling the renew request crashes. It does not appear to be an issue in the master branch, because KubernetesClientException is also caught there: https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-client-api/src/main/java/io/fabric8/kubernetes/client/extended/leaderelection/LeaderElector.java#L146
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [ConfigMap] with name: [flink-example-statemachine-cluster-config-map] in namespace: [default] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:206) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:167) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:90) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.get(ConfigMapLock.java:55) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:135) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.renew(LeaderElector.java:120) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$renewWithTimeout$1(LeaderElector.java:104) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.net.ConnectException: Failed to connect to /10.96.0.1:443
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:265) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.RealConnection.connect(RealConnection.java:183) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.java:224) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.java:108) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.java:88) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.Transmitter.newExchange(Transmitter.java:169) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:41) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:133) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:42) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:290) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.execute(RealCall.java:81) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.retryWithExponentialBackoff(OperationSupport.java:589) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:488) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:470) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:831) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:201) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
... 12 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source) ~[?:?]
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source) ~[?:?]
at java.net.AbstractPlainSocketImpl.connect(Unknown Source) ~[?:?]
at java.net.SocksSocketImpl.connect(Unknown Source) ~[?:?]
at java.net.Socket.connect(Unknown Source) ~[?:?]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.platform.Platform.connectSocket(Platform.java:130) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:263) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.RealConnection.connect(RealConnection.java:183) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.java:224) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.java:108) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.java:88) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.Transmitter.newExchange(Transmitter.java:169) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:41) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:133) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:42) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:290) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.execute(RealCall.java:81) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.retryWithExponentialBackoff(OperationSupport.java:589) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:488) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:470) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:831) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:201) ~[flink-kubernetes-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
... 12 more
Fabric8 Kubernetes Client version
5.5.0
Steps to reproduce
- Configure the lease-duration and renew-deadline to 60s
- Restart the apiserver in minikube via docker restart {container-id}
- The apiserver will recover in 10s
- Get the logs: "Renew deadline reached after 60 seconds while renewing lock" appears and leadership is revoked
Expected behavior
Leadership should not be revoked, because a subsequent renew attempt will succeed if the apiserver recovers quickly.
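The expected behavior can be illustrated with a minimal, standard-library-only sketch. It is not the fabric8 LeaderElector code: the class RenewLoopSketch, the method renewBeforeDeadline, and the simulated failure are all hypothetical names introduced here. It only shows the idea that an exception thrown by one renew attempt is treated as a failed attempt, so retries continue until the renew deadline.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical illustration; none of these names come from the fabric8 client.
public class RenewLoopSketch {

    /**
     * Calls attempt every retryPeriod until it returns true or the deadline
     * would be exceeded. An exception from a single attempt counts as a
     * failed attempt and does not stop the retries.
     */
    public static boolean renewBeforeDeadline(
            BooleanSupplier attempt, Duration deadline, Duration retryPeriod)
            throws InterruptedException {
        long deadlineNanos = System.nanoTime() + deadline.toNanos();
        while (true) {
            try {
                if (attempt.getAsBoolean()) {
                    return true; // lease renewed, leadership is kept
                }
            } catch (RuntimeException e) {
                // Assumed handling: report the failure and fall through to the next retry.
                System.out.println("Renew attempt failed, will retry: " + e.getMessage());
            }
            // Give up only when there is no time left for another retry.
            if (deadlineNanos - System.nanoTime() < retryPeriod.toNanos()) {
                return false;
            }
            Thread.sleep(retryPeriod.toMillis());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        // Simulated apiserver: unreachable for the first three attempts, then healthy.
        BooleanSupplier flakyRenew = () -> {
            if (calls.incrementAndGet() <= 3) {
                throw new IllegalStateException("Failed to connect to apiserver");
            }
            return true;
        };
        boolean renewed = renewBeforeDeadline(
                flakyRenew, Duration.ofMillis(500), Duration.ofMillis(20));
        System.out.println("renewed=" + renewed + ", attempts=" + calls.get());
    }
}
```

With a 60s deadline and an apiserver that recovers in about 10s, as in the reproduction steps, a loop of this shape would have many retries left after recovery, which is why the revocation is unexpected.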
Runtime
minikube
Kubernetes API Server version
1.22.3@latest
Environment
Linux
Fabric8 Kubernetes Client Logs
No response
Additional context
No response
Top GitHub Comments
@wangyang0918 hopefully the fix for this issue solved the problem. We will wait for version 1.16.0 to be released and tested in production to be sure once and for all.
Thanks.
I will prepare a PR soon.