OAuth 2.0 (Keycloak) with self-signed certs causes kafka-kafka-{0,1,2} pods to enter CrashLoopBackOff
When configuring the operator with a Kafka object, the Kafka pods crash during startup with:
$ kubectl logs --previous -n strimzi-kafka kafka-kafka-0 | grep Caused 
Caused by: java.lang.RuntimeException: Failed to fetch public keys needed to validate JWT signatures: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
While some information has been redacted, all hostnames that include [fqdn] are resolvable in DNS using that hostname. In places labeled [sensitive], an internal project name has been redacted, but all related names are valid for where they are used.
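The PKIX failure means that whichever JVM component fetches the JWKS document has no trust anchor for the Keycloak certificate chain, i.e. the CA cert is not reaching that listener's HTTPS client. One way to see the chain Keycloak actually presents is to run openssl from any pod that can reach it (the crashlooping brokers may not stay up long enough to exec into). A hedged sketch, assuming Keycloak is served on 443 as the URLs above imply and that an openssl binary is available in that pod:
# Print the subject and issuer of the certificate Keycloak serves.
openssl s_client -connect keycloak-[fqdn]:443 -servername keycloak-[fqdn] </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer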
Expected behavior
Kafka should be able to fetch the public keys required for JWT validation.
Environment:
- Strimzi version: 0.28.0 (Kafka 3.1.0)
- Installation method: Operator installed via Helm
- Kubernetes cluster: v1.21.5-eks-9017834
- Infrastructure: Amazon EKS
- Keycloak: 15.0.2
- Istio: 1.10.1
A bit more background:
Our environment has a root self-signed certificate managed by an external entity. They have provided our Kubernetes cluster with an intermediate certificate, which we've loaded into cert-manager. Using cert-manager, we've created certificates for Keycloak and Kafka.

Keycloak and Istio are already deployed into the cluster; Keycloak is protected by Istio, which requires mutual TLS to access the pod via the internal service name. We're using Terraform to create the strimzi-kafka namespace, copy the Keycloak certificates from the istio-system namespace into the strimzi-kafka namespace, and deploy strimzi-kafka; a kubectl sketch of the copy step follows below. (We recognize that this doesn't handle certificate renewals and that we will have to examine some mechanism for secret replication between namespaces.)

While it shouldn't impact this scenario, Istio is configured with a gateway for all of the Kafka hostnames (kafka, kafka-0, kafka-1, and kafka-2) with tls.mode = PASSTHROUGH, so Kafka terminates inbound TLS connections itself. The Keycloak URL provided to Kafka is the virtual service name assigned to Keycloak, meaning it's accessible via the intranet in use for this cluster. We've confirmed from another pod within the cluster that the pods can access the external URL for Keycloak without issue.
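For reference, the Terraform step that copies the certificate secret between namespaces is roughly equivalent to the following kubectl sketch (jq availability is an assumption; the server-managed metadata is stripped so the object can be re-applied):
# Copy the Keycloak TLS secret from istio-system into strimzi-kafka.
kubectl get secret keycloak-[sensitive]-cert -n istio-system -o json \
  | jq 'del(.metadata.namespace, .metadata.uid, .metadata.resourceVersion, .metadata.creationTimestamp, .metadata.ownerReferences)' \
  | kubectl apply -n strimzi-kafka -f -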
YAML files and logs
Our kafka configuration:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka
  namespace: strimzi-kafka
spec:
  entityOperator:
    topicOperator: {}
    userOperator: {}
  kafka:
    authorization:
      clientId: oidc-client
      delegateToKafkaAcls: true
      superUsers:
      - User:service-account-oidc-client
      tokenEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/token
      type: keycloak
    config:
      log.message.format.version: "2.8"
      offsets.topic.replication.factor: 1
      transaction.state.log.min.isr: 1
      transaction.state.log.replication.factor: 1
    listeners:
    - authentication:
        jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
        maxSecondsWithoutReauthentication: 3600
        tlsTrustedCertificates:
        - certificate: ca.crt
          secretName: keycloak-[sensitive]-cert
        type: oauth
        userNameClaim: preferred_username
        validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
      configuration:
        brokerCertChainAndKey:
          certificate: tls.crt
          key: tls.key
          secretName: kafka-external-cert
        brokers:
        - advertisedHost: kafka-0.[fqdn]
          broker: 0
        - advertisedHost: kafka-1.[fqdn]
          broker: 1
        - advertisedHost: kafka-2.[fqdn]
          broker: 2
      name: external
      port: 9094
      tls: true
      type: internal
    - authentication:
        jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
        maxSecondsWithoutReauthentication: 3600
        type: oauth
        userNameClaim: preferred_username
        validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
      configuration:
        brokerCertChainAndKey:
          certificate: tls.crt
          key: tls.key
          secretName: kafka-internal-cert
      name: internal
      port: 9093
      tls: true
      type: internal
    - authentication:
        jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
        maxSecondsWithoutReauthentication: 3600
        type: oauth
        userNameClaim: preferred_username
        validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
      name: plain
      port: 9092
      tls: false
      type: internal
    logging:
      loggers:
        log4j.logger.io.strimzi: DEBUG
        log4j.logger.kafka: DEBUG
        log4j.logger.org.apache.kafka: DEBUG
      type: inline
    replicas: 3
    storage:
      class: gp2-encrypted
      deleteClaim: true
      size: 15Gi
      type: persistent-claim
  zookeeper:
    replicas: 3
    storage:
      class: gp2-encrypted
      deleteClaim: true
      size: 100Gi
      type: persistent-claim
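One sanity check against this config: each oauth listener validates the TLS connection to the JWKS endpoint independently, so it is worth confirming which listeners actually carry a tlsTrustedCertificates block. A hedged sketch against the CR above:
# Show each listener's authentication block; every oauth listener that
# should trust the custom CA needs a tlsTrustedCertificates entry.
kubectl get kafka kafka -n strimzi-kafka -o yaml | grep -A8 'authentication:'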
From another pod running in the cluster, which happens to have curl installed, and using the value of the ca.crt key stored in the keycloak-[sensitive]-cert TLS secret, it is possible to obtain a JWT successfully with curl, like this:
curl -s -X POST https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/token --cacert /tmp/tmp.zyYoYth7Ks -H 'Content-Type: application/x-www-form-urlencoded' -d client_secret=9478[sensitive]2a22 -d grant_type=client_credentials -d client_id=oidc-client
{"access_token":"eyJhbGciOiJSUzI1...R7h4Yw","expires_in":7200,"refresh_expires_in":0,"token_type":"Bearer","not-before-policy":0,"scope":"profile email"}
Kafka startup fails; logs attached.
Resolution (from the issue comments)
We couldn't see the forest for the trees. This will be an integration point for applications outside of the cluster, and the internal paths won't be used at all. We've added the CA cert to all of the other listeners, and it did in fact start Kafka properly. Thank you @scholzj!
It did turn out to be that. It took a while to confirm, but we now have everything working properly in our "sandbox" environment, with SSL connections between all components. Thanks!
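For anyone landing here with the same symptom: in the Kafka resource above, only the external listener carries tlsTrustedCertificates, so the internal and plain oauth listeners fall back to the default JVM truststore when validating the JWKS endpoint's certificate, which produces exactly the PKIX error shown. Per the resolution, the fix is to repeat the same block under the authentication section of every oauth listener, e.g. for the internal listener:
    - authentication:
        jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
        maxSecondsWithoutReauthentication: 3600
        tlsTrustedCertificates:
        - certificate: ca.crt
          secretName: keycloak-[sensitive]-cert
        type: oauth
        userNameClaim: preferred_username
        validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
      name: internal
      port: 9093
      tls: true
      type: internal
The plain listener on port 9092 needs the same tlsTrustedCertificates entry, since it too fetches the JWKS document over HTTPS even though the listener itself is not TLS.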