Cannot access KFServing InferenceService through Ingress IAP GCP (Error 503).

/kind bug

What steps did you take and what happened: Cannot access KFServing InferenceService through Ingress (Error 503).

I deployed an instance of Kubeflow (v1.3.0) to GCP along with a KFServing InferenceService. I am unable to access the InferenceService from outside the cluster.

The InferenceService is a custom model that was deployed following the custom model tutorial: https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1alpha2/custom/kfserving-custom-model. The manifest:

apiVersion: "serving.kubeflow.org/v1beta1"
kind: InferenceService
metadata:
  annotations:
    sidecar.istio.io/inject: "false"
  name: { Model }
spec:
  predictor:
    containers:
      - name: { Model }
        image: { My Image }

I was able to verify the above deployment by exec-ing into the deployment and submitting a request.

kubectl exec -it { Model }-predictor-default-00002-deployment-... -- /bin/bash

curl -X POST http://127.0.0.1:8080/v1/models/{ Model }:predict -d '{"instances": [...]}'
>>> {"predictions": "..."}

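The same in-pod check can be scripted. This is a minimal sketch using only the Python standard library, assuming the predictor serves the V1 `:predict` endpoint on port 8080 as in the curl call above. `build_predict_request`, the model name `my-model`, and the sample instances are hypothetical placeholders, not part of KFServing.

```python
import json
import urllib.request


def build_predict_request(base_url: str, model: str, instances: list) -> urllib.request.Request:
    """Build (but do not send) a POST to the V1 predict endpoint."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/models/{model}:predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "my-model" and the instances are placeholders for { Model } and real input.
req = build_predict_request("http://127.0.0.1:8080", "my-model", [[1.0, 2.0]])
print(req.get_method(), req.full_url)

# To actually send it from inside the pod (needs a running predictor):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it makes it easy to confirm the URL and body before pointing the same call at another address.
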
After verifying the deployment, I followed the GCP IAP guide for KFServing (https://github.com/kubeflow/kfserving/tree/master/docs/samples/gcp-iap). I applied an Istio VirtualService manifest to route external traffic from the ingress gateway to the deployment.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: vs-iap
spec:
  gateways:
  - kubeflow/kubeflow-gateway
  hosts:
  - '*'
  http:
  - match:
    - uri:
        prefix: /kfserving/default/{ Model }
    route:
    - destination:
        ## knative-local-gateway is in the `knative-serving` namespace.
        ## Attempted to make the request with both of the following.
        # host: knative-local-gateway.knative-serving.svc.cluster.local
        host: knative-local-gateway.istio-system.svc.cluster.local
      headers:
        request:
          set:
            Host: { Model }-predictor-default.default.{ Cluster Name }.endpoints.{ Project }.cloud.goog
      weight: 100
    rewrite:
        uri: /v1/models/{ Model }
    timeout: 300s
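
To sanity-check the route, it can help to trace which path the predictor would receive. The sketch below is a simplified model of the prefix match plus `rewrite.uri` in the manifest above, assuming the matched prefix is swapped for the rewrite value. `rewrite_prefix` and the model name `my-model` are hypothetical and not part of Istio or KFServing.

```python
from typing import Optional


def rewrite_prefix(path: str, match_prefix: str, rewrite_uri: str) -> Optional[str]:
    """Simplified model of a VirtualService prefix match with rewrite.uri.

    Returns the rewritten path, or None when the path does not match the prefix.
    """
    if not path.startswith(match_prefix):
        return None
    # Replace only the matched prefix; the remainder of the path is kept.
    return rewrite_uri + path[len(match_prefix):]


model = "my-model"  # hypothetical stand-in for { Model }
incoming = f"/kfserving/default/{model}:predict"
rewritten = rewrite_prefix(
    incoming,
    match_prefix=f"/kfserving/default/{model}",
    rewrite_uri=f"/v1/models/{model}",
)
print(rewritten)  # /v1/models/my-model:predict
```

Under that assumption the rewritten path is the same one used in the in-pod curl check, so the URI rewrite on its own would probably not explain a 503. The destination host and the Host header are the other candidates to inspect.
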

Using the following example, I submitted a request to the cluster: https://github.com/kubeflow/kubeflow/blob/master/docs/gke/iap_request.py

Request

python3 iap_request.py https://{ Cluster }.endpoints.{ Project }.cloud.goog/kfserving/default/{ Model }:predict xxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com --input=./input.json

Response

Bad response from application: 503 / {'date': 'Fri, 11 Jun 2021 04:03:34 GMT', 'server': 'istio-envoy', 'content-length': '0', 'Via': '1.1 google', 'Alt-Svc': 'clear'} / ''

What did you expect to happen:

Anything else you would like to add: During setup I followed the prerequisites (https://github.com/kubeflow/kfserving#prerequisites).

  • Istio 1.9.0 required: Istio version is 1.9.3.
  • Knative is 0.22.0: No cluster-local-gateway setup needed (knative-local-gateway has replaced cluster-local-gateway)
  • Cert Manager: Applied script.
  • Applied the inference service config: kubectl apply -f https://raw.githubusercontent.com/kubeflow/kfserving/master/config/configmap/inferenceservice.yaml

Related to:

The solution in issue 1199 appears to be out of date due to the deprecation of the cluster-local-gateway.

Any help resolving this would be greatly appreciated.

Environment:

  • Istio Version: 1.9.3
  • Knative Version: v0.22.0
  • KFServing Version: v0.5.1 (According to the kubeflow docs https://www.kubeflow.org/docs/components/kfserving/kfserving/)
  • Kubeflow version: 1.3.0
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version:
  • Kubernetes version: (use kubectl version): 1.18.17-gke.1901
  • OS (e.g. from /etc/os-release):

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 24 (14 by maintainers)

Top GitHub Comments

zijianjoy commented, Oct 6, 2021 (1 reaction)

@edi-bice-by It is actually expected that you see istio-ingressgateway as the matching pods and do not see a knative-local-gateway deployment.

You can try the Release candidate for Kubeflow on Google Cloud in https://github.com/kubeflow/gcp-blueprints/releases/tag/v1.4.0-rc.0. You can follow this documentation (PR in progress) to test out Kubeflow 1.4: https://deploy-preview-2957--competent-brattain-de2d6d.netlify.app/docs/distributions/gke/deploy/deploy-cli/.

austynherman commented, Aug 16, 2021 (1 reaction)

@zijianjoy That seems like it has worked. I also needed to change the namespace of the knative-local-gateway. The knative-local-gateway is deployed to the knative-serving namespace instead of the istio-system namespace.

I still need to attempt to deploy a model and make a request. Will update when I have done that.

Thanks again for all the help. @yuzisun @zijianjoy
