Pod not starting when running income sample
/kind bug
What steps did you take and what happened: I followed the instructions for running the income sample.
After applying the YAML file, the pods end up in Init:CrashLoopBackOff
and there are errors in the logs.
kubectl create -f income.yaml
inferenceservice.serving.kubeflow.org/income created
kubectl get po
NAME READY STATUS RESTARTS AGE
income-explainer-default-bk42k-deployment-855569cfc8-r6j7x 0/3 Init:CrashLoopBackOff 3 74s
income-predictor-default-z5p4w-deployment-5d87f87f-b9vkd 0/3 Init:CrashLoopBackOff 3 73s
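Since the failure is in an init container, the logs have to be requested with an explicit -c flag. A minimal sketch, assuming the tabular output of kubectl get po shown above, that picks out the pods stuck in an Init: state and builds the matching kubectl logs commands. The helper names are hypothetical and not part of KFServing or kubectl.

```python
def failing_init_pods(kubectl_output):
    """Return names of pods whose STATUS column starts with 'Init:'."""
    pods = []
    # Skip the header row (NAME READY STATUS RESTARTS AGE).
    for line in kubectl_output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 3 and fields[2].startswith("Init:"):
            pods.append(fields[0])
    return pods


def log_commands(pods, container="storage-initializer"):
    """Build one 'kubectl logs' command per pod for the given init container."""
    return [f"kubectl logs {pod} -c {container}" for pod in pods]


# Output copied from the report above.
sample = """
NAME READY STATUS RESTARTS AGE
income-explainer-default-bk42k-deployment-855569cfc8-r6j7x 0/3 Init:CrashLoopBackOff 3 74s
income-predictor-default-z5p4w-deployment-5d87f87f-b9vkd 0/3 Init:CrashLoopBackOff 3 73s
"""

pods = failing_init_pods(sample)
for command in log_commands(pods):
    print(command)
```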
Below are the logs of one of the pods:
kubectl logs income-explainer-default-bk42k-deployment-855569cfc8-r6j7x -c storage-initializer
[I 200404 14:01:51 initializer-entrypoint:13] Initializing, args: src_uri [gs://seldon-models/sklearn/income/explainer] dest_path[ [/mnt/models]
[I 200404 14:01:51 storage:35] Copying contents of gs://seldon-models/sklearn/income/explainer to local
[I 200404 14:01:52 storage:111] Downloading: /mnt/models/explainer.dill
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 719, in download_to_file
transport, file_obj, download_url, headers, start, end, raw_download
File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 643, in _do_download
download.consume(transport)
File "/usr/local/lib/python3.7/site-packages/google/resumable_media/requests/download.py", line 153, in consume
self._process_response(result)
File "/usr/local/lib/python3.7/site-packages/google/resumable_media/_download.py", line 171, in _process_response
response, _ACCEPTABLE_STATUS_CODES, self._get_status_code
File "/usr/local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 96, in require_status_code
*status_codes
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 412, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/storage-initializer/scripts/initializer-entrypoint", line 14, in <module>
kfserving.Storage.download(src_uri, dest_path)
File "/usr/local/lib/python3.7/site-packages/kfserving/storage.py", line 48, in download
Storage._download_gcs(uri, out_dir)
File "/usr/local/lib/python3.7/site-packages/kfserving/storage.py", line 112, in _download_gcs
blob.download_to_filename(dest_path)
File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 759, in download_to_filename
raw_download=raw_download,
File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 722, in download_to_file
_raise_from_invalid_response(exc)
File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2156, in _raise_from_invalid_response
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.PreconditionFailed: 412 GET https://storage.googleapis.com/download/storage/v1/b/seldon-models/o/sklearn%2Fincome%2Fexplainer%2Fexplainer.dill?generation=1563992055026415&alt=media: ('Request failed with status code', 412, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
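The failing URL in the last line is percent-encoded, which makes it hard to see which object was requested. A small sketch, using only the Python standard library, that splits such a download URL into bucket, object name and generation. The function name is hypothetical, and the path layout (/b/<bucket>/o/<object>) is assumed from the URL in the log above.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def describe_gcs_download_url(url):
    """Extract bucket, object name and generation from a GCS download URL."""
    parts = urlsplit(url)
    # The path keeps its percent-encoding, so the object name is one segment.
    segments = parts.path.split("/")
    bucket = segments[segments.index("b") + 1]
    object_name = unquote(segments[segments.index("o") + 1])
    query = parse_qs(parts.query)
    return {
        "bucket": bucket,
        "object": object_name,
        "generation": query.get("generation", [None])[0],
    }


# URL copied from the PreconditionFailed message above.
url = (
    "https://storage.googleapis.com/download/storage/v1/b/seldon-models/o/"
    "sklearn%2Fincome%2Fexplainer%2Fexplainer.dill"
    "?generation=1563992055026415&alt=media"
)
info = describe_gcs_download_url(url)
print(info)
```

The decoded object name matches the file the initializer reported just before the traceback (explainer.dill under sklearn/income/explainer), and the request is pinned to a specific object generation.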
What did you expect to happen: Pods start
Anything else you would like to add: I have the same issue when running the two other samples in the explanation/alibi folder (imagenet and moviesentiment).
Environment:
- Istio Version:
- Knative Version:
- KFServing Version:
- Kubeflow version: 1.0.1 in GCP
- Minikube version:
- Kubernetes version: (use kubectl version):
- OS (e.g. from /etc/os-release):
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
I ran into the same error for a different model and was hoping this issue would provide some answers. Why is it closed?
google.api_core.exceptions.PreconditionFailed: 412 GET https://storage.googleapis.com/download/storage/v1/b/seldon-models/o/sklearn%2Fmoviesentiment%2Fmodel.joblib?generation=1567959830881500&alt=media: The operation requires that Uniform Bucket Level Access be enabled.: ('Request failed with status code', 412, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
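To tell whether the 412 is specific to the cluster or comes from the bucket itself, one option is to retry the same download from a workstation. A hedged sketch, assuming the google-cloud-storage package is installed and that the bucket is meant to be readable without credentials (hence the anonymous client). The function name and the optional client parameter are hypothetical and exist only so the helper can be exercised with a stub.

```python
def try_anonymous_download(bucket_name, object_name, dest_path, client=None):
    """Attempt one object download and return (ok, message)."""
    if client is None:
        # Imported lazily so the helper can be tested without the library.
        from google.cloud import storage

        client = storage.Client.create_anonymous_client()
    blob = client.bucket(bucket_name).blob(object_name)
    try:
        blob.download_to_filename(dest_path)
    except Exception as exc:  # e.g. google.api_core.exceptions.PreconditionFailed
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"downloaded to {dest_path}"
```

For the object in the message above, the call would be try_anonymous_download("seldon-models", "sklearn/moviesentiment/model.joblib", "/tmp/model.joblib"). If it fails with the same PreconditionFailed error outside the cluster, the pod configuration is unlikely to be the cause.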
@yuzisun Thanks for looking at this issue
Hmm, something looks weird. The 412 Precondition Failed client error response code indicates that access to the target resource has been denied. This error happens for all examples (imagenet, income and moviesentiment) in the alibi folder when downloading from the bucket gs://seldon-models/.
I tried those examples a few times and the error happens every time. I am running them on Kubeflow 1.0.1 installed on GCP.
Maybe I am wrong, but it looks like I don't have the rights to download into /mnt/models. I don't know how to check (and possibly modify) those rights, because I use the images provided by the sample and I don't know where the Dockerfile is.
I run the sample "as is". I don't know how /mnt/models is mounted in the container.
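The traceback in the report is raised while processing the HTTP response (status 412), not while writing the file, so a local permission problem on /mnt/models seems unlikely. Still, a minimal sketch for ruling it out, assuming a Python process can be run with the same volume mounted. The function name is hypothetical.

```python
import tempfile


def can_write(directory):
    """Return True if a temporary file can be created in the directory."""
    try:
        # The file is removed automatically when the context exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers a missing directory as well as missing permissions.
        return False


if __name__ == "__main__":
    print(can_write("/mnt/models"))
```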