OAuth 2.0 (keycloak) with self-signed certs causes kafka-kafka-{0,1,2} pods to crashloop backoff

When the operator is configured with a Kafka object, the Kafka pods crash during startup with:

$ kubectl logs --previous -n strimzi-kafka kafka-kafka-0 | grep Caused 
Caused by: java.lang.RuntimeException: Failed to fetch public keys needed to validate JWT signatures: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

While some information has been redacted, all hostnames that include [fqdn] are resolvable in DNS using that hostname. In places labeled [sensitive], an internal project name has been redacted, but all related names are valid for where they are used.

Expected behavior: Kafka should be able to fetch the public keys required for JWT validation.

Environment (please complete the following information):

  • Strimzi version: 0.28.0 (kafka 3.1.0)
  • Installation method: Operator installed via helm
  • Kubernetes cluster: v1.21.5-eks-9017834
  • Infrastructure: Amazon EKS
  • keycloak 15.0.2
  • istio 1.10.1

A bit more background:

Our environment has a self-signed root certificate managed by an external entity. They have provided our Kubernetes cluster with an intermediate certificate, which we’ve loaded into cert-manager. Using cert-manager, we’ve created certificates for Keycloak and Kafka.

Keycloak and istio are already deployed into the cluster. Keycloak is protected by istio, which requires mutual TLS to access the pod using the internal service name. We’re using terraform to create the strimzi-kafka namespace, copy the Keycloak certificates from the istio-system namespace into the strimzi-kafka namespace, and deploy strimzi-kafka. (We recognize that this doesn’t handle certificate renewals and that we will have to examine some mechanism for secret replication between namespaces.)

While it shouldn’t impact this scenario, istio is configured with a gateway for all of the Kafka hostnames (kafka, kafka-0, kafka-1, and kafka-2) with tls.mode = PASSTHROUGH, so Kafka is terminating inbound connections. The Keycloak URL provided to Kafka is the virtualservice name provided to Keycloak, meaning it’s accessible via the intranet in use for this cluster. We’ve confirmed from another pod within the cluster that pods can access the external URL for Keycloak without issue.

YAML files and logs

Our kafka configuration:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka
  namespace: strimzi-kafka
spec:
  entityOperator:
    topicOperator: {}
    userOperator: {}
  kafka:
    authorization:
      clientId: oidc-client
      delegateToKafkaAcls: true
      superUsers:
      - User:service-account-oidc-client
      tokenEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/token
      type: keycloak
    config:
      log.message.format.version: "2.8"
      offsets.topic.replication.factor: 1
      transaction.state.log.min.isr: 1
      transaction.state.log.replication.factor: 1
    listeners:
    - authentication:
        jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
        maxSecondsWithoutReauthentication: 3600
        tlsTrustedCertificates:
        - certificate: ca.crt
          secretName: keycloak-[sensitive]-cert
        type: oauth
        userNameClaim: preferred_username
        validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
      configuration:
        brokerCertChainAndKey:
          certificate: tls.crt
          key: tls.key
          secretName: kafka-external-cert
        brokers:
        - advertisedHost: kafka-0.[fqdn]
          broker: 0
        - advertisedHost: kafka-1.[fqdn]
          broker: 1
        - advertisedHost: kafka-2.[fqdn]
          broker: 2
      name: external
      port: 9094
      tls: true
      type: internal
    - authentication:
        jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
        maxSecondsWithoutReauthentication: 3600
        type: oauth
        userNameClaim: preferred_username
        validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
      configuration:
        brokerCertChainAndKey:
          certificate: tls.crt
          key: tls.key
          secretName: kafka-internal-cert
      name: internal
      port: 9093
      tls: true
      type: internal
    - authentication:
        jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
        maxSecondsWithoutReauthentication: 3600
        type: oauth
        userNameClaim: preferred_username
        validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
      name: plain
      port: 9092
      tls: false
      type: internal
    logging:
      loggers:
        log4j.logger.io.strimzi: DEBUG
        log4j.logger.kafka: DEBUG
        log4j.logger.org.apache.kafka: DEBUG
      type: inline
    replicas: 3
    storage:
      class: gp2-encrypted
      deleteClaim: true
      size: 15Gi
      type: persistent-claim
  zookeeper:
    replicas: 3
    storage:
      class: gp2-encrypted
      deleteClaim: true
      size: 100Gi
      type: persistent-claim

From another pod in the cluster that happens to have curl installed, it is possible to obtain a JWT successfully using the “ca.crt” value stored in the keycloak-[sensitive]-cert TLS secret:

curl -s -X POST https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/token --cacert /tmp/tmp.zyYoYth7Ks -H 'Content-Type: application/x-www-form-urlencoded' -d client_secret=9478[sensitive]2a22 -d grant_type=client_credentials -d client_id=oidc-client
{"access_token":"eyJhbGciOiJSUzI1...R7h4Yw","expires_in":7200,"refresh_expires_in":0,"token_type":"Bearer","not-before-policy":0,"scope":"profile email"}

Kafka startup fails; logs attached.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

anthony-zawacki commented, Jun 22, 2022 (1 reaction)

We couldn’t see the forest for the trees. This will be an integration point for applications outside of the cluster, and the internal paths won’t be used at all. We’ve added the CA cert to all of the other listeners, and Kafka did in fact start properly. Thank you, @scholzj!

anthony-zawacki commented, Jul 12, 2022 (0 reactions)

It did turn out to be that. It took a while to confirm, but we now have everything working in our “sandbox” environment with SSL connections between all components and everything working properly. Thanks!
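
The fix described in the Jun 22 comment can be sketched as the fragment below. In the original configuration only the first listener carries tlsTrustedCertificates, while the internal and plain listeners point at the same HTTPS jwksEndpointUri without it. The fragment repeats that entry on the two remaining listeners. It is reconstructed from the configuration posted in the issue and is not the reporter’s final manifest; fields unrelated to authentication are left out and are assumed to stay as in the original.

```yaml
# Fragment of spec.kafka.listeners (sketch based on the config in the issue).
# Only the two listeners that lacked a trusted CA are shown.
listeners:
- name: internal
  port: 9093
  tls: true
  type: internal
  authentication:
    type: oauth
    jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
    validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
    userNameClaim: preferred_username
    maxSecondsWithoutReauthentication: 3600
    # Added: same CA secret the first listener already references.
    tlsTrustedCertificates:
    - certificate: ca.crt
      secretName: keycloak-[sensitive]-cert
- name: plain
  port: 9092
  tls: false
  type: internal
  authentication:
    type: oauth
    jwksEndpointUri: https://keycloak-[fqdn]/auth/realms/OIDC/protocol/openid-connect/certs
    validIssuerUri: https://keycloak-[fqdn]/auth/realms/OIDC
    userNameClaim: preferred_username
    maxSecondsWithoutReauthentication: 3600
    # Added: needed even though the listener itself is plaintext, because
    # the JWKS endpoint is still reached over HTTPS.
    tlsTrustedCertificates:
    - certificate: ca.crt
      secretName: keycloak-[sensitive]-cert
```

The keycloak authorization block also contacts Keycloak over HTTPS through tokenEndpointUri. The thread does not say whether it needed the same change, so whether to add a trusted CA there as well is something to check against the Strimzi documentation for the version in use.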
