
Expose Kubernetes client configuration options

See original GitHub issue

Proposed change

Expose a way to inject configuration options for the kubespawner.KubeSpawner.api object. Right now it appears to be instantiated without passing in args or kwargs (self.api = shared_client('CoreV1Api')).

Alternative options

Some configuration options can be set via environment variables, but others cannot. For instance, there is no environment variable corresponding to kubernetes.client.Configuration.get_default_copy().ssl_ca_cert.

Who would use this feature?

Kubernetes clusters that use custom CA certs may be misconfigured such that the CA cert provided to service accounts (/var/run/secrets/kubernetes.io/serviceaccount/ca.crt inside the Hub pod) does not validate the Kubernetes API server's certificate. When KubeSpawner initializes and attempts to list pods in the namespace, it raises a CERTIFICATE_VERIFY_FAILED error.

See https://discourse.jupyter.org/t/enabling-jh-to-use-an-additional-ca-cert-when-calling-private-k8s-api/6662 and discussions in JupyterHub Gitter between me and @consideRatio on Mar 1 / Mar 2 2021.

It is worth noting that there is probably a cluster-administration solution to this problem. I do not know what needs to change on the cluster configuration side to provision working ca.crt files to the service account and its accompanying secrets. Either way, I think offering config options in KubeSpawner and in the zero-to-jupyterhub config.yaml is worthwhile.

(Optional): Suggest a solution

I defer to @yuvipanda @consideRatio on how to implement this. From a user perspective, I would like to be able to mount my actual CA certs into the Pod and then tell KubeSpawner to use them, either via environment variables or as a config option. Imagine I have a ca-certs Secret in my namespace with a ca.crt file that includes the k8s API server's CA, and I'm deploying JupyterHub with the zero-to-jupyterhub-k8s helm chart.

# config.yaml
hub:
  extraVolumes:
    - name: ca-crt
      secret:
        secretName: ca-certs
  extraVolumeMounts:
    - name: ca-crt
      mountPath: /etc/pki/certs/ca-certificate.crt
      subPath: ca.crt
  # NEW SYNTAX (?)
  kubeSpawner:
    ssl_ca_cert: /etc/pki/certs/ca-certificate.crt
  # ALTERNATIVELY
  extraEnv:
    KUBESPAWNER_CA_CERT: /etc/pki/certs/ca-certificate.crt
# kubespawner/clients.py
import os

import kubernetes.client

def shared_client(ClientType, *args, **kwargs):
    # caching stuff in here, not included for the sake of brevity
    # kubernetes.config.load_incluster_config() already called prior to invoking this function
    if 'KUBESPAWNER_CA_CERT' in os.environ:
        conf = kubernetes.client.Configuration.get_default_copy()
        conf.ssl_ca_cert = os.environ['KUBESPAWNER_CA_CERT']
        kubernetes.client.Configuration.set_default(conf)
    Client = getattr(kubernetes.client, ClientType)
    client = Client(*args, **kwargs)
    return client
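The env-var override pattern above can be exercised without a cluster or the kubernetes package by substituting a minimal stand-in for kubernetes.client.Configuration. The stand-in class below is hypothetical and only mimics the get_default_copy / set_default shape used above:

```python
import copy
import os

class FakeConfiguration:
    """Stand-in mimicking kubernetes.client.Configuration's default-copy/set-default API."""
    _default = None

    def __init__(self):
        self.ssl_ca_cert = None

    @classmethod
    def get_default_copy(cls):
        # return a copy of the current default, or a fresh instance if none is set
        return copy.copy(cls._default) if cls._default is not None else cls()

    @classmethod
    def set_default(cls, conf):
        cls._default = conf

def apply_env_overrides():
    # mirror the proposed shared_client() logic
    if 'KUBESPAWNER_CA_CERT' in os.environ:
        conf = FakeConfiguration.get_default_copy()
        conf.ssl_ca_cert = os.environ['KUBESPAWNER_CA_CERT']
        FakeConfiguration.set_default(conf)

os.environ['KUBESPAWNER_CA_CERT'] = '/etc/pki/certs/ca-certificate.crt'
apply_env_overrides()
print(FakeConfiguration.get_default_copy().ssl_ca_cert)
```

Any client constructed after set_default() would pick up the patched default configuration, which is why the override must run before the first client is instantiated.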

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 13 (6 by maintainers)

Top GitHub Comments

1 reaction
kafonek commented, Mar 8, 2021

https://github.com/rook/rook/issues/3122#issuecomment-498320421 details a workaround using PodPreset for the intermediate-CA problem if you can administer the cluster. We’re talking to our cluster admins about enabling the PodPreset admission controller. Our config would in theory look like:

apiVersion: settings.k8s.io/v1alpha1
kind: PodPreset
metadata:
  name: fix-ca-bundle
  namespace: mbs-jupyterhub
spec:
  selector:
    matchLabels:
      app: jupyterhub
      component: hub
  volumeMounts:
    - mountPath: /run/secrets/kubernetes.io/serviceaccount/ca.crt
      name: ca-certs
      subPath: ca.crt
  volumes:
    - name: ca-certs
      configMap:
        name: ca-certs

Still looking at a PR for you @consideRatio for cases when the cluster admin cannot or will not add the PodPreset controller.

1 reaction
consideRatio commented, Mar 5, 2021

Pathway to accepting this configuration

Also support configuring host

This is a tweaked example of what @kafonek suggested that also configures the host which likely goes hand in hand with the ssl_ca_cert config in many situations.

import os
import weakref

import kubernetes.client

def shared_client(ClientType, *args, **kwargs):
    # ...
    if client is None:
        # configure the k8s client's network access to the k8s api-server
        if "KUBESPAWNER_K8S_SSL_CA_CERT" in os.environ:
            conf = kubernetes.client.Configuration.get_default_copy()
            conf.ssl_ca_cert = os.environ["KUBESPAWNER_K8S_SSL_CA_CERT"]
            kubernetes.client.Configuration.set_default(conf)
        if "KUBESPAWNER_K8S_HOST" in os.environ:
            conf = kubernetes.client.Configuration.get_default_copy()
            conf.host = os.environ["KUBESPAWNER_K8S_HOST"]
            kubernetes.client.Configuration.set_default(conf)

        Client = getattr(kubernetes.client, ClientType)
        client = Client(*args, **kwargs)
        # cache weakref so that clients can be garbage collected
        _client_cache[cache_key] = weakref.ref(client)
    return client
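The weakref caching in the elided portion of the snippet can be sketched in isolation. The cache key and class names below are assumptions for illustration, not the actual kubespawner internals:

```python
import weakref

_client_cache = {}

class ApiClient:
    """Stand-in for a kubernetes API client object."""
    pass

def shared_client(cache_key):
    # return a previously created client if it is still alive
    ref = _client_cache.get(cache_key)
    client = ref() if ref is not None else None
    if client is None:
        client = ApiClient()
        # cache a weak reference so clients can be garbage collected
        # once no caller holds a strong reference to them
        _client_cache[cache_key] = weakref.ref(client)
    return client

a = shared_client("CoreV1Api")
b = shared_client("CoreV1Api")
print(a is b)  # the second call returns the cached instance
```

Using weakref.ref rather than storing the client directly means the cache never keeps a client alive on its own; once all callers drop their references, the client is collected and a later call constructs a fresh one.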

Configure with traitlets instead of env vars

I think I’d favor configuration of kubespawner using the typical configuration options (traitlets) instead of through environment variables.
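As a sketch of what traitlets-based configuration could look like from a user's perspective (the trait names below are hypothetical, not a committed API):

```python
# jupyterhub_config.py -- hypothetical trait names, for illustration only
c.KubeSpawner.k8s_api_ssl_ca_cert = "/etc/pki/certs/ca-certificate.crt"
c.KubeSpawner.k8s_api_host = "https://kubernetes.default.svc:443"
```

Traitlets would keep this consistent with how every other KubeSpawner option is configured, and would let the z2jh chart expose it through its existing hub.config mechanism rather than through extra environment variables.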

Documentation about it

I’d like there to be clear documentation about if and when users should expect to need this configuration, and whether by using it they may end up with similar issues elsewhere. I assume a cluster that requires this will cause the same problem for many other k8s workloads deployed in it, and I would like that discussed.

What I’d like to avoid

I’ll describe what I’d like to avoid; what I think is needed to avoid it is some research to verify that this really is necessary and that there is no better way.

In short, I’d like to avoid a situation where users of KubeSpawner are led to believe something makes sense to do, and we put in effort to support that path, only to realize it isn’t sustainable and that there is a proper way to solve it elsewhere.

So in practice, I want to avoid first making KubeSpawner, the user-scheduler, and the hook-image-awaiter configurable like this, with the documentation and maintenance that requires, only to later learn there was a better way for both end users and maintainers of the JupyterHub Helm chart, while users of these config options had stopped looking for that better solution because we provided this path.

For my mind to be more at ease, I’d like as much validation as possible that this really is the right thing to do in general. Perhaps other Helm charts have done it like this and concluded there was no way around it that made more sense?
