Strimzi Kafka failing after Kubernetes upgrade to 1.22
Hi, I have been facing the same issue since upgrading Kubernetes from 1.21 to 1.22 last night. Below is my deployment file:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: noice-dev
  namespace: strimzi-dev
spec:
  kafka:
    version: 3.1.0
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: external
        port: 9094
        type: loadbalancer
        tls: false
        configuration:
          bootstrap:
            annotations:
              cloud.google.com/load-balancer-type: "Internal"
          brokers:
            - broker: 0
              annotations:
                cloud.google.com/load-balancer-type: "Internal"
            - broker: 1
              annotations:
                cloud.google.com/load-balancer-type: "Internal"
            - broker: 2
              annotations:
                cloud.google.com/load-balancer-type: "Internal"
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.1"
    storage:
      type: persistent-claim
      size: 30Gi
      deleteClaim: false
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: kafka-metrics-config.yml
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 30Gi
      deleteClaim: false
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: zookeeper-metrics-config.yml
  entityOperator:
    topicOperator: {}
    userOperator: {}
  kafkaExporter:
    topicRegex: ".*"
    groupRegex: ".*"
NAME                                        READY   STATUS             RESTARTS         AGE
noice-dev-zookeeper-0                       0/1     CrashLoopBackOff   21 (3m32s ago)   58m
noice-dev-zookeeper-1                       0/1     CrashLoopBackOff   21 (2m50s ago)   58m
noice-dev-zookeeper-2                       0/1     CrashLoopBackOff   21 (3m22s ago)   58m
strimzi-cluster-operator-585f6fd9d7-8x9fk   1/1     Running            0                59m
Cluster operator logs below
2022-07-08 14:00:30 INFO ClusterOperator:128 - Triggering periodic reconciliation for namespace strimzi-dev
2022-07-08 14:00:30 INFO AbstractOperator:226 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Kafka noice-dev will be checked for creation or modification
2022-07-08 14:01:30 INFO AbstractOperator:373 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Reconciliation is in progress
2022-07-08 14:02:30 INFO ClusterOperator:128 - Triggering periodic reconciliation for namespace strimzi-dev
2022-07-08 14:02:30 INFO AbstractOperator:373 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Reconciliation is in progress
2022-07-08 14:03:30 INFO AbstractOperator:373 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Reconciliation is in progress
2022-07-08 14:04:30 INFO ClusterOperator:128 - Triggering periodic reconciliation for namespace strimzi-dev
2022-07-08 14:04:30 INFO AbstractOperator:373 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Reconciliation is in progress
2022-07-08 14:05:30 INFO AbstractOperator:373 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Reconciliation is in progress
2022-07-08 14:05:31 ERROR Util:153 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Exceeded timeout of 300000ms while waiting for Pods resource noice-dev-zookeeper-1 in namespace strimzi-dev to be ready
2022-07-08 14:05:31 ERROR Util:153 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Exceeded timeout of 300000ms while waiting for Pods resource noice-dev-zookeeper-2 in namespace strimzi-dev to be ready
2022-07-08 14:05:31 ERROR Util:153 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Exceeded timeout of 300000ms while waiting for Pods resource noice-dev-zookeeper-0 in namespace strimzi-dev to be ready
2022-07-08 14:05:31 ERROR AbstractOperator:247 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): createOrUpdate failed
io.strimzi.operator.common.operator.resource.TimeoutException: Exceeded timeout of 300000ms while waiting for Pods resource noice-dev-zookeeper-0 in namespace strimzi-dev to be ready
at io.strimzi.operator.common.Util$1.lambda$handle$1(Util.java:154) ~[io.strimzi.operator-common-0.29.0.jar:0.29.0]
at io.vertx.core.impl.future.FutureImpl$3.onFailure(FutureImpl.java:153) ~[io.vertx.vertx-core-4.2.4.jar:4.2.4]
at io.vertx.core.impl.future.FutureBase.lambda$emitFailure$1(FutureBase.java:69) ~[io.vertx.vertx-core-4.2.4.jar:4.2.4]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503) ~[io.netty.netty-transport-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:995) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
2022-07-08 14:05:31 WARN AbstractOperator:532 - Reconciliation #31(timer) Kafka(strimzi-dev/noice-dev): Failed to reconcile
io.strimzi.operator.common.operator.resource.TimeoutException: Exceeded timeout of 300000ms while waiting for Pods resource noice-dev-zookeeper-0 in namespace strimzi-dev to be ready
at io.strimzi.operator.common.Util$1.lambda$handle$1(Util.java:154) ~[io.strimzi.operator-common-0.29.0.jar:0.29.0]
at io.vertx.core.impl.future.FutureImpl$3.onFailure(FutureImpl.java:153) ~[io.vertx.vertx-core-4.2.4.jar:4.2.4]
at io.vertx.core.impl.future.FutureBase.lambda$emitFailure$1(FutureBase.java:69) ~[io.vertx.vertx-core-4.2.4.jar:4.2.4]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503) ~[io.netty.netty-transport-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:995) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty.netty-common-4.1.77.Final.jar:4.1.77.Final]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
_Originally posted by @4sif-adnan in https://github.com/strimzi/strimzi-kafka-operator/issues/3971#issuecomment-1179030208_
Top GitHub Comments
Adjusting the resource requests and limits for Kafka and ZooKeeper fixed the issue. Thanks for the help.
Well, 1.21 is a different situation: without limits, it depends on where the pod gets scheduled, how many resources you have, whether you have any limit ranges configured, and so on. If the ZooKeeper log does not show anything more, then it is hard to point anywhere else because there is not much to work with.
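For readers hitting the same CrashLoopBackOff, the fix reported in this thread (setting resource requests and limits for Kafka and ZooKeeper) might look roughly like the fragment below. This is a minimal sketch, not the poster's actual configuration: the thread does not state which values were used, so every CPU and memory figure here is a placeholder assumption to be sized for your own cluster. The `resources` property sits under `spec.kafka` and `spec.zookeeper` of the `Kafka` custom resource and takes the usual Kubernetes `requests` and `limits` maps.

```yaml
# Fragment of the Kafka custom resource shown earlier; only the
# `resources` blocks are added. All CPU and memory values are
# illustrative assumptions, not values taken from the thread.
spec:
  kafka:
    resources:
      requests:
        cpu: "1"
        memory: 4Gi
      limits:
        cpu: "2"
        memory: 4Gi
  zookeeper:
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: "1"
        memory: 1Gi
```

Setting explicit requests means the pods no longer depend on where they happen to be scheduled or on namespace defaults injected by a LimitRange, which matches the explanation given in the maintainer's comment.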