Protect KFServing endpoint
/kind feature
Currently, a kfserving-gateway is used for KFServing inference endpoints. It uses a LoadBalancer, and all endpoints are publicly available without protection.
This doesn’t meet the requirements for a production-grade cluster. We need a way to protect these endpoints. Currently, the probe issue seems to be a blocker for serving inference behind an authn layer.
Describe the solution you’d like
Long term, I think there are three options (assuming the probe issue gets fixed).
- Remove kfserving-gateway and reuse istio-ingressgateway, which means external requests need to be authenticated. I think both IAP (GCP) and Cognito/OIDC (AWS) support programmatic authentication; I am not sure about Dex. The advantage is that this solution reuses AuthN and AuthZ from the existing infra.
- Keep a separate gateway for kfserving and leave the implementation to different vendors. Users could build authentication on top of it. For example, AWS can replace the service with an ingress and use LB-level authentication for it. The reason not to reuse istio-gateway is that we can have other authentication strategies for kfserving. For example, each user can request different API keys for different models.
- Have a middleware for kfserving to manage API keys. Not sure if there’s an existing solution on Kubernetes. This sounds like a very common use case.
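To make the per-model API key idea from the last two options concrete, here is a minimal Python sketch of the check such a middleware might perform. Everything in it is hypothetical: `ApiKeyStore`, `issue`, and `is_allowed` are illustrative names, the store is in-memory only, and nothing here is an existing KFServing or Kubernetes API.

```python
import hashlib
import hmac
import secrets


class ApiKeyStore:
    """Hypothetical in-memory stand-in for wherever a real middleware would persist keys."""

    def __init__(self):
        # Maps (user, model) -> SHA-256 hex digest of the issued key,
        # so the plaintext key is never stored.
        self._digests = {}

    def issue(self, user: str, model: str) -> str:
        """Create a key scoped to one user and one model and return it once."""
        key = secrets.token_urlsafe(32)
        self._digests[(user, model)] = hashlib.sha256(key.encode()).hexdigest()
        return key

    def is_allowed(self, user: str, model: str, presented_key: str) -> bool:
        """Check a presented key against the digest stored for (user, model)."""
        expected = self._digests.get((user, model))
        if expected is None:
            return False
        presented = hashlib.sha256(presented_key.encode()).hexdigest()
        # Constant-time comparison avoids leaking digest prefixes through timing.
        return hmac.compare_digest(expected, presented)
```

A single dictionary lookup and one hash per request keeps the check cheap, which matters if the solution has to be latency optimized. A real deployment would also need key rotation, revocation, and durable storage, none of which are shown here.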
Anything else you would like to add: The Pipeline SDK will have a similar issue, so we can consider the two together.
The user experience should be simple: a client can get a clientId and secret to refresh a token, or just use an assigned token to make calls directly.
The solution needs to be latency optimized.
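As an illustration of the "use an assigned token to make calls directly" experience, the sketch below builds an authenticated prediction request using only the Python standard library. It assumes the auth layer in front of the gateway accepts an `Authorization: Bearer` header and that the model is served on the TensorFlow-Serving-style `/v1/models/<name>:predict` path. The function name, gateway URL, host, and token are placeholders, and how the token is obtained or refreshed is out of scope.

```python
import json
import urllib.request


def build_predict_request(gateway_url, service_host, model_name, token, instances):
    """Build (but do not send) a prediction request carrying a bearer token.

    gateway_url, service_host, and token are placeholders the caller supplies;
    this helper is illustrative and not part of any KFServing SDK.
    """
    url = f"{gateway_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The Host header selects the inference service behind a shared gateway.
            "Host": service_host,
            # Assumption: the authn layer validates a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

The request would be sent with `urllib.request.urlopen(req)`. Whether a given provider (IAP, Cognito/OIDC, Dex) accepts a token presented this way depends on how that provider is configured.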
Related Issue: https://github.com/kubeflow/kfctl/issues/140 https://github.com/kubeflow/kubeflow/issues/4912
Issue Analytics
- Created 3 years ago
- Reactions: 5
- Comments: 29 (16 by maintainers)
Top GitHub Comments
I would definitely lean toward option 1. Our default install should provide a single gateway that is configured correctly for kubeflow. Multiple gateways create confusion and potential security holes. Option 3 seems like a non-starter. I’d prefer that we aren’t in the business of authentication. Curious to hear what others think.
@jlewi @ellis-bigelow @animeshsingh @Jeffwan @cliveseldon @krishnadurai I have tested the knative probe fix with GCP/IAP and am able to get KFServing working e2e with https://github.com/kubeflow/manifests/pull/1137. However, KFServing as is does not work out of the box because of the following issues.
sidecar.istio.io/inject: false
on the KFServing inference service; it seems the kubeflow tfserving example does the same trick.
As discussed in the last WG meeting, @cliveseldon will help test this probe fix with the Istio/Dex kfdef and @Jeffwan can help with the aws kfdef. I am unsure how Istio/Dex will work with KFServing since it does not support programmatic tokens; @krishnadurai @yanniszark might have some ideas.