Issues with KFServing as part of a Kubeflow install
/kind bug
I’m seeing some issues with the KFServing install that’s part of the ‘out-of-the-box’ Kubeflow install (0.7.1). As documented, this ‘should’ work without installing anything extra: the KF 0.7.1 install includes Istio and Knative Serving, and installs the kfserving-controller-manager StatefulSet.
It’s not clear whether the following problems are all related, so this bug might need to be split into several. It’s possible that some of these issues stem from the knative-serving and kubeflow gateways conflicting, since apparently there can be problems when traffic (HTTP or HTTPS) could be routed via either gateway. See e.g. https://github.com/istio/istio/issues/11509.
First, it looks like an InferenceService can only be deployed into the automatically-created kubeflow-<user> namespace. Is this intended? Deploying into any other namespace gives this error:

```
Error from server: error when creating "kfserving-tf-flowers.yaml": admission webhook "inferenceservice.kfserving-webhook-server.validator" denied the request: Cannot create the Inferenceservice "flowers-sample" in namespace "kubeflow": the namespace lacks label "serving.kubeflow.org/inferenceservice: enabled"
```
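The error message itself suggests a possible workaround: label the target namespace so the KFServing admission webhook accepts it. This is only a sketch based on the label key/value quoted in the error above; I haven’t confirmed it’s the supported path on 0.7.1:

```shell
# Add the label the admission webhook checks for (key and value taken
# verbatim from the error message above).
kubectl label namespace kubeflow serving.kubeflow.org/inferenceservice=enabled
```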
Once it is deployed, the InferenceService shows Ready == False and gives ‘Failed to reconcile predictor’ errors:
```
% kubectl describe inferenceservice flowers-sample -n kubeflow-amyu
Name:         flowers-sample
Namespace:    kubeflow-amyu
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"serving.kubeflow.org/v1alpha2","kind":"InferenceService","metadata":{"annotations":{},"name":"flowers-sample","namespace":"...
API Version:  serving.kubeflow.org/v1alpha2
Kind:         InferenceService
Metadata:
  Creation Timestamp:  2020-01-09T02:55:25Z
  Generation:          6
  Resource Version:    7257
  Self Link:           /apis/serving.kubeflow.org/v1alpha2/namespaces/kubeflow-amyu/inferenceservices/flowers-sample
  UID:                 7c11596e-328b-11ea-a589-42010a800278
Spec:
  Default:
    Predictor:
      Tensorflow:
        Resources:
          Limits:
            Cpu:     1
            Memory:  2Gi
          Requests:
            Cpu:     1
            Memory:  2Gi
        Runtime Version:  1.14.0
        Storage Uri:      gs://kfserving-samples/models/tensorflow/flowers
Status:
  Canary:
  Conditions:
    Last Transition Time:  2020-01-09T02:56:01Z
    Message:               Waiting for VirtualService to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  DefaultPredictorReady
    Last Transition Time:  2020-01-09T02:55:25Z
    Message:               Failed to reconcile predictor
    Reason:                PredictorHostnameUnknown
    Status:                False
    Type:                  Ready
    Last Transition Time:  2020-01-09T02:55:25Z
    Message:               Failed to reconcile predictor
    Reason:                PredictorHostnameUnknown
    Status:                False
    Type:                  RoutesReady
  Default:
    Predictor:
      Name:  flowers-sample-predictor-default-fjjkn
Events:  <none>
```
Then, in this section of the instructions: https://github.com/kubeflow/kfserving/tree/master/docs/samples/tensorflow#run-a-prediction
…this command does not return a value:

```
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
```

It’s not finding status.url. Here’s what the JSON looks like; what should the command be returning? I’m guessing SERVICE_HOSTNAME should be set to flowers-sample-predictor-default.kubeflow-amyu.svc.cluster.local or similar, right? But perhaps due to the issue above, I don’t see that string in the JSON below.
```
% kubectl get inferenceservice ${MODEL_NAME} -o json
{
    "apiVersion": "serving.kubeflow.org/v1alpha2",
    "kind": "InferenceService",
    "metadata": {
        "annotations": {
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"serving.kubeflow.org/v1alpha2\",\"kind\":\"InferenceService\",\"metadata\":{\"annotations\":{},\"name\":\"flowers-sample\",\"namespace\":\"kubeflow\"},\"spec\":{\"default\":{\"predictor\":{\"tensorflow\":{\"storageUri\":\"gs://kfserving-samples/models/tensorflow/flowers\"}}}}}\n"
        },
        "creationTimestamp": "2020-01-08T21:17:24Z",
        "generation": 5,
        "name": "flowers-sample",
        "namespace": "kubeflow",
        "resourceVersion": "19958",
        "selfLink": "/apis/serving.kubeflow.org/v1alpha2/namespaces/kubeflow/inferenceservices/flowers-sample",
        "uid": "43c7f8c0-325c-11ea-bed8-42010a80015f"
    },
    "spec": {
        "default": {
            "predictor": {
                "tensorflow": {
                    "resources": {
                        "limits": {
                            "cpu": "1",
                            "memory": "2Gi"
                        },
                        "requests": {
                            "cpu": "1",
                            "memory": "2Gi"
                        }
                    },
                    "runtimeVersion": "1.14.0",
                    "storageUri": "gs://kfserving-samples/models/tensorflow/flowers"
                }
            }
        }
    },
    "status": {
        "canary": {},
        "conditions": [
            {
                "lastTransitionTime": "2020-01-08T21:17:58Z",
                "message": "Waiting for VirtualService to be ready",
                "reason": "Uninitialized",
                "status": "Unknown",
                "type": "DefaultPredictorReady"
            },
            {
                "lastTransitionTime": "2020-01-08T21:17:24Z",
                "message": "Failed to reconcile predictor",
                "reason": "PredictorHostnameUnknown",
                "status": "False",
                "type": "Ready"
            },
            {
                "lastTransitionTime": "2020-01-08T21:17:24Z",
                "message": "Failed to reconcile predictor",
                "reason": "PredictorHostnameUnknown",
                "status": "False",
                "type": "RoutesReady"
            }
        ],
        "default": {
            "predictor": {
                "name": "flowers-sample-predictor-default-jnp9v"
            }
        }
    }
}
```
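For reference, when the predictor reconciles successfully, status.url holds a full URL, and the sample’s cut invocation just strips the scheme to get the host. A minimal local illustration of that parsing (the URL below is a made-up example of the expected shape, not output from my cluster):

```shell
# Hypothetical status.url value, shaped like what KFServing populates on success.
url="http://flowers-sample.kubeflow-amyu.example.com/v1/models/flowers-sample"

# Same parsing as the sample instructions: field 3 of a "/"-split URL is the host.
SERVICE_HOSTNAME=$(echo "$url" | cut -d "/" -f 3)
echo "$SERVICE_HOSTNAME"
```

Since status.url is never populated here (PredictorHostnameUnknown), there is nothing for cut to parse, which matches the empty SERVICE_HOSTNAME.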
Finally, even if I deploy the KF install so that the istio-ingressgateway is set up with an external IP, I can’t successfully make an inference request by following the instructions; I get an origin authentication failure.
```
% kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
35.192.166.45
% curl -v -H "Host: ${SERVICE_HOSTNAME}" http://35.192.166.45/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
*   Trying 35.192.166.45...
* TCP_NODELAY set
* Connected to 35.192.166.45 (35.192.166.45) port 80 (#0)
> POST /v1/models/flowers-sample:predict HTTP/1.1
> Host: flowers-sample-predictor-default.kubeflow-amyu.svc.cluster.local
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 16201
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
< HTTP/1.1 401 Unauthorized
< content-length: 29
< content-type: text/plain
< date: Thu, 09 Jan 2020 17:30:36 GMT
< server: istio-envoy
< connection: close
<
* we are done reading and this is set to close, stop send
* Closing connection 0
Origin authentication failed.
```
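The 401 “Origin authentication failed.” is served by istio-envoy, which suggests an Istio end-user (JWT) authentication policy on the gateway is rejecting the unauthenticated request, rather than KFServing itself. One way to look for such a policy; this is a guess on my part, assuming the authentication.istio.io/v1alpha1 Policy CRD that ships with the Istio version in Kubeflow 0.7:

```shell
# List Istio authentication policies; a mesh- or gateway-scoped policy with an
# "origins" (JWT) section would explain the 401 on an unauthenticated curl.
kubectl get policies.authentication.istio.io --all-namespaces

# Inspect candidate policies for a jwt/origins stanza (namespace is a guess).
kubectl -n istio-system get policies.authentication.istio.io -o yaml
```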
(cc @jlewi as fyi)
Issue Analytics
- Created: 4 years ago
- Reactions: 10
- Comments: 22 (10 by maintainers)
Top GitHub Comments
Thanks @janeman98 - those last two bullets helped, as the sample instructions did not work as-is.
@wronk Sorry for my late response; I didn’t see your question until stomplee commented on this issue.

I installed Kubeflow on minikube using:

```
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.0.yaml"
```

Since I don’t have a load balancer in my environment, I use the cluster IP:

```
CLUSTER_IP=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.clusterIP}')
```

I don’t do anything special for SERVICE_HOSTNAME:

```
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
```

This sets SERVICE_HOSTNAME=flowers-sample.default.example.com.
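Putting those steps together, the prediction request from inside the cluster would look roughly like this. This is a sketch assuming the same sample variables as the kfserving tensorflow instructions, and it is untested in my environment:

```shell
MODEL_NAME=flowers-sample
INPUT_PATH=@./input.json   # sample payload from the kfserving samples directory

# Gateway cluster IP instead of an external load-balancer IP.
CLUSTER_IP=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.clusterIP}')
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)

# The Host header is what routes the request through the ingress gateway to the
# right InferenceService, even though we connect to the gateway's cluster IP.
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${CLUSTER_IP}/v1/models/${MODEL_NAME}:predict -d ${INPUT_PATH}
```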