
Issues with KFServing as part of a Kubeflow install

See original GitHub issue

/kind bug

I’m seeing some issues with the KFServing install that’s part of the ‘out-of-the-box’ Kubeflow install (0.7.1). As documented, this ‘should’ work without installing anything else: the KF 0.7.1 install includes istio and knative-serving, and installs the kfserving-controller-manager statefulset.
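
To confirm those components actually came up, one can check for the relevant pods (a quick diagnostic sketch; the namespace and statefulset names assume the default 0.7.1 layout):

% kubectl get pods -n istio-system
% kubectl get pods -n knative-serving
% kubectl get statefulset kfserving-controller-manager -n kubeflow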

It’s not clear whether the following problems are all related, so this bug might need to be split into several. It’s possible that some of these issues stem from the knative-serving and kubeflow gateways conflicting, as apparently there can be problems if traffic (HTTP or HTTPS) could be routed via either gateway, e.g.: https://github.com/istio/istio/issues/11509.
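
One quick check for that theory is to list the gateways present in the cluster (a diagnostic sketch; in a default install I’d expect to see both kubeflow-gateway and knative-ingress-gateway):

% kubectl get gateways.networking.istio.io --all-namespaces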

First, it looks like an inferenceservice can only be deployed into the automatically-created kubeflow-<user> namespace. Is this intended? Otherwise, there’s this error:

Error from server: error when creating "kfserving-tf-flowers.yaml": admission webhook "inferenceservice.kfserving-webhook-server.validator" denied the request: Cannot create the Inferenceservice "flowers-sample" in namespace "kubeflow": the namespace lacks label "serving.kubeflow.org/inferenceservice: enabled"
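
If deploying into other namespaces is in fact supported, my guess is the webhook just wants the label from the error message, i.e. something like the following (untested, inferred purely from that message):

% kubectl label namespace kubeflow serving.kubeflow.org/inferenceservice=enabled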

Once deployed, the inferenceservice shows Ready==False and reports ‘Failed to reconcile predictor’ errors:

% kubectl describe inferenceservice flowers-sample -n kubeflow-amyu
Name:         flowers-sample
Namespace:    kubeflow-amyu
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"serving.kubeflow.org/v1alpha2","kind":"InferenceService","metadata":{"annotations":{},"name":"flowers-sample","namespace":"...
API Version:  serving.kubeflow.org/v1alpha2
Kind:         InferenceService
Metadata:
  Creation Timestamp:  2020-01-09T02:55:25Z
  Generation:          6
  Resource Version:    7257
  Self Link:           /apis/serving.kubeflow.org/v1alpha2/namespaces/kubeflow-amyu/inferenceservices/flowers-sample
  UID:                 7c11596e-328b-11ea-a589-42010a800278
Spec:
  Default:
    Predictor:
      Tensorflow:
        Resources:
          Limits:
            Cpu:     1
            Memory:  2Gi
          Requests:
            Cpu:          1
            Memory:       2Gi
        Runtime Version:  1.14.0
        Storage Uri:      gs://kfserving-samples/models/tensorflow/flowers
Status:
  Canary:
  Conditions:
    Last Transition Time:  2020-01-09T02:56:01Z
    Message:               Waiting for VirtualService to be ready
    Reason:                Uninitialized
    Status:                Unknown
    Type:                  DefaultPredictorReady
    Last Transition Time:  2020-01-09T02:55:25Z
    Message:               Failed to reconcile predictor
    Reason:                PredictorHostnameUnknown
    Status:                False
    Type:                  Ready
    Last Transition Time:  2020-01-09T02:55:25Z
    Message:               Failed to reconcile predictor
    Reason:                PredictorHostnameUnknown
    Status:                False
    Type:                  RoutesReady
  Default:
    Predictor:
      Name:  flowers-sample-predictor-default-fjjkn
Events:      <none>
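
For what it’s worth, the underlying Knative objects and the controller logs might show where reconciliation stops. Something like the following (assuming the controller pod is kfserving-controller-manager-0 in the kubeflow namespace with a container named manager, as in the default install):

% kubectl get ksvc,virtualservice -n kubeflow-amyu
% kubectl logs kfserving-controller-manager-0 -n kubeflow -c manager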

Then, in this section of the instructions: https://github.com/kubeflow/kfserving/tree/master/docs/samples/tensorflow#run-a-prediction, this command does not return a value:

SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)

It’s not finding the status.url. Here’s what the json looks like. What should it be returning? I’m guessing SERVICE_HOSTNAME should be set to flowers-sample-predictor-default.kubeflow-amyu.svc.cluster.local or similar, right? But perhaps due to the above issue, I don’t see that string in the json below.

kubectl get inferenceservice ${MODEL_NAME} -o json
{
    "apiVersion": "serving.kubeflow.org/v1alpha2",
    "kind": "InferenceService",
    "metadata": {
        "annotations": {
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"serving.kubeflow.org/v1alpha2\",\"kind\":\"InferenceService\",\"metadata\":{\"annotations\":{},\"name\":\"flowers-sample\",\"namespace\":\"kubeflow\"},\"spec\":{\"default\":{\"predictor\":{\"tensorflow\":{\"storageUri\":\"gs://kfserving-samples/models/tensorflow/flowers\"}}}}}\n"
        },
        "creationTimestamp": "2020-01-08T21:17:24Z",
        "generation": 5,
        "name": "flowers-sample",
        "namespace": "kubeflow",
        "resourceVersion": "19958",
        "selfLink": "/apis/serving.kubeflow.org/v1alpha2/namespaces/kubeflow/inferenceservices/flowers-sample",
        "uid": "43c7f8c0-325c-11ea-bed8-42010a80015f"
    },
    "spec": {
        "default": {
            "predictor": {
                "tensorflow": {
                    "resources": {
                        "limits": {
                            "cpu": "1",
                            "memory": "2Gi"
                        },
                        "requests": {
                            "cpu": "1",
                            "memory": "2Gi"
                        }
                    },
                    "runtimeVersion": "1.14.0",
                    "storageUri": "gs://kfserving-samples/models/tensorflow/flowers"
                }
            }
        }
    },
    "status": {
        "canary": {},
        "conditions": [
            {
                "lastTransitionTime": "2020-01-08T21:17:58Z",
                "message": "Waiting for VirtualService to be ready",
                "reason": "Uninitialized",
                "status": "Unknown",
                "type": "DefaultPredictorReady"
            },
            {
                "lastTransitionTime": "2020-01-08T21:17:24Z",
                "message": "Failed to reconcile predictor",
                "reason": "PredictorHostnameUnknown",
                "status": "False",
                "type": "Ready"
            },
            {
                "lastTransitionTime": "2020-01-08T21:17:24Z",
                "message": "Failed to reconcile predictor",
                "reason": "PredictorHostnameUnknown",
                "status": "False",
                "type": "RoutesReady"
            }
        ],
        "default": {
            "predictor": {
                "name": "flowers-sample-predictor-default-jnp9v"
            }
        }
    }
}
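
For comparison, on a healthy InferenceService I’d expect the jsonpath query to print a URL, roughly like this (the hostname is a guess based on the default example.com domain):

% kubectl get inferenceservice flowers-sample -o jsonpath='{.status.url}'
http://flowers-sample.kubeflow-amyu.example.com

Here there is no status.url field at all, so the cut pipeline yields an empty string.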

Finally, even if I deploy the KF install so that the istio-ingressgateway is set up with an external IP, I can’t successfully make an inference request by following the instructions. I get an origin auth failure.

% kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}'  
35.192.166.45
% curl -v -H "Host: ${SERVICE_HOSTNAME}" http://35.192.166.45/v1/models/$MODEL_NAME:predict -d $INPUT_PATH 
*   Trying 35.192.166.45...
* TCP_NODELAY set
* Connected to 35.192.166.45 (35.192.166.45) port 80 (#0)
> POST /v1/models/flowers-sample:predict HTTP/1.1
> Host: flowers-sample-predictor-default.kubeflow-amyu.svc.cluster.local
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 16201
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
< HTTP/1.1 401 Unauthorized
< content-length: 29
< content-type: text/plain
< date: Thu, 09 Jan 2020 17:30:36 GMT
< server: istio-envoy
< connection: close
<
* we are done reading and this is set to close, stop send
* Closing connection 0
Origin authentication failed.
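
For diagnosis: the 401 comes from Envoy’s origin (JWT) authentication, so listing the Istio authentication policies might show which one applies. A sketch, assuming the resource names of the Istio version bundled with KF 0.7.1:

% kubectl get policies.authentication.istio.io --all-namespaces
% kubectl get meshpolicies.authentication.istio.io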

(cc @jlewi as fyi)

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 10
  • Comments: 22 (10 by maintainers)

Top GitHub Comments

1 reaction
yantriks-edi-bice commented, Mar 20, 2020

3. SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)

Thanks @janeman98 - those last two bullets helped out, as the sample instructions did not work as-is.

1 reaction
janeman98 commented, Mar 12, 2020

@wronk Sorry for my late response. I didn’t see your question until stomplee commented on this issue.

  1. I install Kubeflow in minikube using export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.0.yaml"

  2. Since I don’t have load balancer in my env, I use CLUSTER_IP=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.clusterIP}')

  3. I don’t do anything special for SERVICE_HOSTNAME: SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3) -> this will set SERVICE_HOSTNAME=flowers-sample.default.example.com
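
Putting those steps together, the whole sequence from the sample looks roughly like this (using the sample’s input.json as INPUT_PATH is my assumption):

MODEL_NAME=flowers-sample
INPUT_PATH=@./input.json
CLUSTER_IP=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.clusterIP}')
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${CLUSTER_IP}/v1/models/${MODEL_NAME}:predict -d ${INPUT_PATH}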
