Webhook certificates validation fails
/kind bug
What steps did you take and what happened:
I installed the latest version of Katib by cloning the repo’s master tree and running `make deploy` against our OpenShift 4.6.21 cluster.
Then I applied `random-example.yaml`.
The created Experiment remains in the Running condition, the Trial’s pods are not updated with sidecar containers, and `deployment/katib-controller` shows logs with the following lines:
2021/04/07 14:57:53 http: TLS handshake error from 10.254.2.1:47974: remote error: tls: bad certificate
2021/04/07 14:57:53 http: TLS handshake error from 10.254.2.1:47972: remote error: tls: bad certificate
What did you expect to happen:
Webhook certificates are valid, the Trial’s pods are injected with metric-gathering sidecars, and the Experiment successfully gathers metrics and progresses as it should.
Anything else you would like to add:
As a result of `job/katib-cert-generator`, the WebhookConfiguration’s `.webhooks[].clientConfig.caBundle` fields are updated with `ca.crt` from the `katib-cert-generator-token` secret, which is assigned to the SA `katib-cert-generator`.
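To inspect what actually ended up in `caBundle`, something like the following sketch may help. It assumes `kubectl` access to the cluster; the function name `webhook_ca_bundle` is hypothetical, and the resource name has to be taken from your own cluster.

```shell
#!/usr/bin/env bash
# Print the PEM-encoded caBundle of the first webhook in a webhook configuration.
#   $1: resource kind (mutatingwebhookconfiguration or validatingwebhookconfiguration)
#   $2: resource name (placeholder; look it up in your cluster)
webhook_ca_bundle() {
  local kind="$1" name="$2"
  # caBundle is stored base64-encoded, so decode it to get the PEM text.
  kubectl get "$kind" "$name" \
    -o 'jsonpath={.webhooks[0].clientConfig.caBundle}' | base64 -d
}

# Example (not run here):
#   webhook_ca_bundle mutatingwebhookconfiguration <name> | openssl x509 -noout -subject -enddate
```

The printed subject can then be compared with the issuer of the webhook serving certificate.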
According to the documentation on CSRs, a ServiceAccount’s `ca.crt` is not guaranteed to verify arbitrary client certificates:
None of these usages are related to ServiceAccount token secrets .data[ca.crt] in any way. That CA bundle is only guaranteed to verify a connection to the API server using the default service (kubernetes.default.svc).
I fetched `tls.crt` from `secret/katib-webhook-cert` and `ca.crt` from `secret/katib-cert-generator-token-***`, which is attached to the corresponding SA. Indeed, the pair is not valid:
[maanur@maanur-notebook katib-webhook-cert]$ openssl verify -verbose -CAfile ca.crt katib.crt
O = system:nodes, CN = system:node:katib-controller.kubeflow.svc
error 20 at 0 depth lookup: unable to get local issuer certificate
error katib.crt: verification failed
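For anyone reproducing this check, a minimal sketch of fetching both files and comparing issuer and subject could look like this. The helper names `secret_field` and `compare_issuer` are hypothetical, the `kubeflow` namespace is an assumption inferred from the certificate CN, and `TOKEN_SECRET` stands for the full name of the token secret, which is not spelled out above.

```shell
#!/usr/bin/env bash
# Print one base64-decoded data field of a Secret.
#   $1: namespace, $2: secret name, $3: data key such as tls.crt or ca.crt
secret_field() {
  local ns="$1" name="$2" key="$3"
  # A go-template with "index" avoids having to escape the dot in the key.
  kubectl -n "$ns" get secret "$name" \
    -o "go-template={{index .data \"$key\"}}" | base64 -d
}

# Show which CA the bundle starts with and who issued the serving certificate.
#   $1: CA file, $2: certificate file
compare_issuer() {
  openssl x509 -noout -subject -in "$1"   # subject of the first certificate in the CA file
  openssl x509 -noout -issuer -in "$2"    # issuer of the webhook certificate
}

# Example (not run here; namespace and TOKEN_SECRET are assumptions):
#   secret_field kubeflow katib-webhook-cert tls.crt > katib.crt
#   secret_field kubeflow "$TOKEN_SECRET" ca.crt > ca.crt
#   compare_issuer ca.crt katib.crt
```

If the issuer of `katib.crt` does not match any subject in `ca.crt`, `openssl verify` fails with error 20 as shown above.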
Environment:
- Katib version: 86884ca2c2ddbd317682ff771eb79c8bec014df5
- Kubeflow version: (not used)
- Kubernetes version: 1.19
- OpenShift version: 4.6.21
Issue Analytics
- Created 2 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
Thank you for creating this, @maanur, and for testing Katib on OpenShift!
Could you please try specifying the `kubernetes.io/legacy-unknown` signerName here: https://github.com/kubeflow/katib/blob/master/hack/cert-generator.sh#L82. Then build and push your custom image for the cert generator, and use your custom image in the manifest: https://github.com/kubeflow/katib/blob/master/manifests/v1beta1/installs/katib-standalone/kustomization.yaml#L46.
My concern is that for OpenShift we need a different signerName. /cc @tenzen-y
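As a rough illustration of the suggestion above, the sketch below renders a CertificateSigningRequest with a configurable signerName. It is not the content of `cert-generator.sh`: the function name `render_csr` and the listed usages are assumptions, and it uses the `certificates.k8s.io/v1beta1` API because the `kubernetes.io/legacy-unknown` signer is not available in `certificates.k8s.io/v1`.

```shell
#!/usr/bin/env bash
# Render a CertificateSigningRequest manifest with a configurable signerName.
#   $1: CSR object name, $2: signerName, $3: base64-encoded PKCS#10 request
render_csr() {
  local name="$1" signer="$2" request_b64="$3"
  # The usages below are an assumption for a webhook serving certificate.
  cat <<EOF
apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
metadata:
  name: ${name}
spec:
  signerName: ${signer}
  request: ${request_b64}
  usages:
  - digital signature
  - key encipherment
  - server auth
EOF
}

# Example (not run here; server.csr is a hypothetical file name):
#   render_csr katib-controller.kubeflow kubernetes.io/legacy-unknown \
#     "$(base64 < server.csr | tr -d '\n')" | kubectl apply -f -
```

Keeping the signerName as a parameter makes it easy to try a different signer on OpenShift if `kubernetes.io/legacy-unknown` does not work there.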
This issue has been automatically closed because it has not had recent activity. Please comment “/reopen” to reopen it.