[backend] GCR images not accessible: `Consumer 'project:ml-pipeline' has been suspended.`
Environment
- How did you deploy Kubeflow Pipelines (KFP)?
Standalone installation with Kustomize on EKS.
- KFP version: 1.7.0
My KFP runs are failing. When I tried to pull the image gcr.io/ml-pipeline/argoexec:v3.1.14-license-compliance
manually from another, unrelated machine, the pull failed with the following error:
Error response from daemon: Head https://gcr.io/v2/ml-pipeline/argoexec/manifests/v3.1.14-license-compliance: denied: Permission denied: Consumer 'project:ml-pipeline' has been suspended.
I apologize if this is something from my side, but it seems serious if it is not.
Edit:
Suggestions
- Your jobs might not actually be failing. For us, the `main` container runs without issue. Only the `wait` container with the `gcr.io/ml-pipeline/argoexec:v3.1.14-license-compliance` image is stuck. This comment details how to deal with that temporarily.
- If your core Kubeflow Pipelines components run on machines that may be replaced at any time (spot instances, for example), then:
  a. Try freezing them if you can. For example, certain K8s scaling systems allow you to restrict scale-down on certain node groups.
  b. Copy the core images to one of your private container registries as soon as possible. After that, you should be able to update your Kubeflow Pipelines manifests to use these private images.
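For suggestion (b), one possible way to point the manifests at mirrored images is a Kustomize `images:` override. The sketch below only generates that block from a list of image references; it does not copy any images. The registry `registry.example.com/kfp-mirror` is a hypothetical placeholder, the helper names are made up for illustration, and the code assumes tag-based references (no `@sha256:` digests).

```python
from __future__ import annotations

# Hypothetical private registry path; replace with your own.
PRIVATE_REGISTRY = "registry.example.com/kfp-mirror"
# Prefix of the public images referenced in this issue.
SOURCE_PREFIX = "gcr.io/ml-pipeline/"


def split_image(ref: str) -> tuple[str, str]:
    """Split 'name:tag' into (name, tag); the tag defaults to 'latest'."""
    if "@" in ref:
        raise ValueError(f"digest references are not handled in this sketch: {ref}")
    # Only a colon after the last slash separates the tag, so a registry
    # port such as 'host:5000/img' is not mistaken for a tag.
    slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > slash:
        return ref[:colon], ref[colon + 1:]
    return ref, "latest"


def kustomize_override(ref: str, registry: str = PRIVATE_REGISTRY) -> dict[str, str]:
    """Build one entry for the Kustomize 'images:' list."""
    name, tag = split_image(ref)
    if not name.startswith(SOURCE_PREFIX):
        raise ValueError(f"unexpected image source: {ref}")
    short_name = name[len(SOURCE_PREFIX):]
    return {"name": name, "newName": f"{registry}/{short_name}", "newTag": tag}


def render_images_block(refs: list[str], registry: str = PRIVATE_REGISTRY) -> str:
    """Render the overrides as YAML text for a kustomization.yaml."""
    lines = ["images:"]
    for ref in refs:
        entry = kustomize_override(ref, registry)
        lines.append(f"- name: {entry['name']}")
        lines.append(f"  newName: {entry['newName']}")
        lines.append(f"  newTag: {entry['newTag']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_images_block(
        ["gcr.io/ml-pipeline/argoexec:v3.1.14-license-compliance"]
    ))
```

The Kustomize image transformer rewrites container `image` fields. Image references that appear elsewhere in the manifests (for example in command-line arguments or ConfigMaps) would likely still need to be updated by hand.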
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
Issue Analytics
- State:
- Created a year ago
- Reactions:13
- Comments:11 (1 by maintainers)
Top GitHub Comments
Could we please get a short RCA from the developers before we close this issue? I want to evaluate the risks of something like this repeating.
We apologize for the trouble. Our project was mistakenly suspended for a short period due to a process error, and we’ll be working on a postmortem to prevent this from happening again.