[Question] ... Strimzi with RedHat Service mesh compatibility

Hi, I’m running an OpenShift cluster (4.5) with the Strimzi community operator (v0.18.0, running 3 ZooKeeper and 3 Kafka nodes) and the Red Hat OpenShift Service Mesh operator (Istio).

When running this setup without Istio, all the pods reach the “Running” and “Ready” state.

After adding the service mesh to the setup (each pod gets an additional sidecar, istio-proxy), the ZooKeeper pods are running and ready, but Kafka fails to start due to a connection timeout to ZooKeeper.

2020-07-28 09:11:23,342 INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn) [main-SendThread(localhost:2181)]
2020-07-28 09:11:23,343 INFO Socket connection established, initiating session, client: /127.0.0.1:37132, server: localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxn) [main-SendThread(localhost:2181)]
2020-07-28 09:11:23,485 INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper) [main]
2020-07-28 09:11:23,485 INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn) [main-EventThread]
2020-07-28 09:11:23,488 INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient) [main]
2020-07-28 09:11:23,492 ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) [main]
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
	at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:262)
	at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:258)
	at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:119)
	at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1863)
	at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:378)
	at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:403)
	at kafka.server.KafkaServer.startup(KafkaServer.scala:210)
	at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
	at kafka.Kafka$.main(Kafka.scala:82)
	at kafka.Kafka.main(Kafka.scala)

I don’t see any errors in the TLS sidecar.

After looking into Istio’s instructions regarding ZooKeeper, we found that we needed to add an environment variable to the ZooKeeper config. We didn’t find any way to do this via the Strimzi operator, so we tried changing it in the StatefulSet itself. Unfortunately, it seems the operator reverts this change once we restart the pod. That led us to remove the operator so it wouldn’t change the configuration back.

My questions are:

  1. How can we change the ZooKeeper config via the operator? Is there any way to do that without the operator changing it back?
  2. Is there a workaround for, or an integration between, Strimzi and istio-proxy?
  3. Can we somehow disable the TLS sidecar through the Strimzi configuration?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 14 (8 by maintainers)

Top GitHub Comments

1 reaction
ReggieCarey commented, Sep 2, 2021

I’m not entirely sure this is resolved to satisfaction. The value of a service mesh from a security standpoint is very high. I’m not sure why this fails to work. Worse, I’m surprised it’s not documented on your website. I’ve spent a huge amount of time getting to this particular bug, and that would have been short-circuited had you mentioned the incompatibility with Istio.

  1. Document the incompatibility when trying to create a cluster: somewhere prominent on the getting started pages at a minimum, and on operatorhub.io.
     1.5) Have the operator examine the metadata of the target namespace to verify that the service mesh is disabled.
     1.5a) Edit pod templates to disable sidecar injection so that the cluster can still be deployed into sidecar-enabled namespaces.
  2. Request that you resolve this problem so that clusters can be spun up in namespaces that have a service mesh enabled.

1 reaction
scholzj commented, Jul 30, 2020

Adding an annotation should be fairly easy. You can do it in your Kafka CR using the template object: https://strimzi.io/docs/operators/latest/using.html#assembly-customizing-kubernetes-resources-str

A custom authorizer is a bit more complicated. You will need to add support for it to the api module and some other classes. It will also have to support the super.users option. You can see this PR as an example, which added OPA authorization: https://github.com/strimzi/strimzi-kafka-operator/pull/3192 … and once you have that, you can rebuild the images.

If the authorizer is something generally useful and open source, you could also open a proposal and it could be added directly to Strimzi.
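
As an illustration of the annotation route mentioned above, here is a minimal sketch that sets a pod-template annotation on a Kafka CR and prints the result as JSON. It is not taken from the thread. It assumes the template layout `spec.<component>.template.pod.metadata.annotations` from the linked Strimzi docs, uses Istio's `sidecar.istio.io/inject` annotation as the example (the excerpt does not say which annotation was actually meant), and works on a hypothetical, heavily trimmed CR named `my-cluster`. The `apiVersion` shown is an assumption for Strimzi 0.18.0; check it against your installed CRDs.

```python
import copy
import json

# Istio's per-pod opt-out annotation for automatic sidecar injection.
ISTIO_INJECT = "sidecar.istio.io/inject"


def set_pod_annotation(kafka_cr, component, key, value):
    """Return a copy of a Kafka CR dict with one pod-template annotation set.

    Assumed layout: spec.<component>.template.pod.metadata.annotations,
    where component is "kafka" or "zookeeper".
    """
    patched = copy.deepcopy(kafka_cr)
    node = patched.setdefault("spec", {}).setdefault(component, {})
    for field in ("template", "pod", "metadata", "annotations"):
        node = node.setdefault(field, {})
    node[key] = value
    return patched


# Hypothetical, trimmed Kafka CR; a real one also needs listeners, storage, etc.
kafka_cr = {
    "apiVersion": "kafka.strimzi.io/v1beta1",  # assumed for Strimzi 0.18.0
    "kind": "Kafka",
    "metadata": {"name": "my-cluster"},
    "spec": {"kafka": {"replicas": 3}, "zookeeper": {"replicas": 3}},
}

patched = kafka_cr
for component in ("kafka", "zookeeper"):
    # Annotation values must be strings, hence "false" rather than False.
    patched = set_pod_annotation(patched, component, ISTIO_INJECT, "false")

print(json.dumps(patched, indent=2))
```

Because the annotation lives in the Kafka CR rather than in the StatefulSet, the operator should keep it in place instead of reverting it. Note that opting the pods out of injection only keeps them outside the mesh, as suggested in item 1.5a above; it does not make Strimzi work with the istio-proxy sidecar.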
