In-cluster builds on EKS: credentials not found in native keychain
Bug
Current Behavior
In-cluster builds (both Kaniko and cluster-buildkit) fail on EKS with an ECR registry when building an image with garden build.
The build fails with the following error:
[2022-11-17T15:24:37.385Z] Error: Unable to query registry for image status: time="2022-11-17T15:24:37Z" level=fatal msg="Error parsing image name \"docker://***********.dkr.ecr.us-east-1.amazonaws.com/****/repo/test:v-36b8e5e856\": getting username and password: 1 error occurred:\n\t* credentials not found in native keychain\n\n"
at skopeoBuildStatus (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/plugins/kubernetes/container/build/common.ts:263:13)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (internal/process/task_queues.js:95:5)
at /snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/actions.ts:1303:24
at ActionRouter.getBuildStatus (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/actions.ts:359:20)
at wrapped.process (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/tasks/build.ts:132:22)
at TaskNode.process (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/task-graph.ts:801:20)
at wrapped.processNode (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/task-graph.ts:436:18)
Error Details:
command:
- skopeo
- '--command-timeout=30s'
- inspect
- '--raw'
- '--authfile'
- /.docker/config.json
- >-
docker://***********.dkr.ecr.us-east-1.amazonaws.com/****/repo/test:v-36b8e5e856
output: >
time="2022-11-17T15:24:37Z" level=fatal msg="Error parsing image name
\"docker://***********.dkr.ecr.us-east-1.amazonaws.com/****/repo/test:v-36b8e5e856\":
getting username and password: 1 error occurred:\n\t* credentials not found in
native keychain\n\n"
[2022-11-17T15:24:37.400Z] Error: 1 build action(s) failed!
at handleProcessResults (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/commands/base.ts:532:19)
at BuildCommand.action (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/commands/build.ts:148:32)
at GardenCli.runCommand (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/cli/cli.ts:508:20)
at GardenCli.run (/snapshot/project/tmp/pkg/cli/node_modules/@garden-io/core/src/cli/cli.ts:667:26)
at Object.runCli (/snapshot/project/tmp/pkg/cli/src/cli.ts:41:14)
Error Details:
results:
build.test-image:
type: build
description: building test-image
key: build.test-image
name: test-image
error:
detail:
command:
- skopeo
- '--command-timeout=30s'
- inspect
- '--raw'
- '--authfile'
- /.docker/config.json
- >-
docker://***********.dkr.ecr.us-east-1.amazonaws.com/****/repo/test:v-36b8e5e856
output: >
time="2022-11-17T15:24:37Z" level=fatal msg="Error parsing image name
\"docker://***********.dkr.ecr.us-east-1.amazonaws.com/****/repo/test:v-36b8e5e856\":
getting username and password: 1 error occurred:\n\t* credentials not
found in native keychain\n\n"
type: runtime
startedAt: '2022-11-17T15:24:35.473Z'
completedAt: '2022-11-17T15:24:37.326Z'
batchId: b84dedc4-5292-47e7-ab98-a26b0a8fc485
version: v-36b8e5e856
My pod has IAM permissions to access ECR as it is being run on an instance that has the correct IAM role, and imagePullSecrets are set up according to instructions.
After some investigation, I found that the error is thrown by skopeo in the garden-utils pod. That pod currently runs amazon-ecr-credential-helper v0.4.0, which, for some reason, can’t load the instance credentials. I logged into the garden-utils pod as root and updated amazon-ecr-credential-helper to 0.6.0, and skopeo now works as expected.
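For readers trying to confirm which helper skopeo ends up calling, the lookup can be sketched as follows. This is a minimal illustration, not Garden's or skopeo's actual code: it assumes the auth file passed via --authfile follows the usual Docker config layout, where a `credHelpers` entry maps a registry host to a helper suffix and the helper binary is named `docker-credential-<suffix>` (for ECR, `docker-credential-ecr-login`). The function name, the example config, and the account ID `123456789012` are hypothetical.

```python
import json
from typing import Optional

# Hypothetical auth file content; the account ID and region are placeholders.
EXAMPLE_CONFIG = """
{
  "credHelpers": {
    "123456789012.dkr.ecr.us-east-1.amazonaws.com": "ecr-login"
  }
}
"""


def helper_for_registry(config_text: str, registry: str) -> Optional[str]:
    """Return the credential helper binary name configured for a registry.

    A per-registry "credHelpers" entry takes precedence; otherwise the
    global "credsStore" is used. Returns None if neither is configured.
    """
    config = json.loads(config_text)
    suffix = config.get("credHelpers", {}).get(registry) or config.get("credsStore")
    return f"docker-credential-{suffix}" if suffix else None


if __name__ == "__main__":
    registry = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
    print(helper_for_registry(EXAMPLE_CONFIG, registry))
```

If the result is `docker-credential-ecr-login`, the credentials come from whichever version of amazon-ecr-credential-helper is installed in the pod, which is why the helper version matters here.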
Expected behavior
The in-cluster build would work without errors.
Reproducible example
You can use the kaniko example setting it up on EKS using an ECR registry.
Workaround
There are no known workarounds.
Suggested solution(s)
Update the garden-util image (gardendev/k8s-util) to use the latest version of amazon-ecr-credential-helper (0.6.0) so that it can load credentials from the instance’s attached IAM role.
Additional context
Your environment
- OS: macOS Ventura 13.0
- How I’m running Kubernetes: EKS running Kubernetes 1.23
garden version
0.12.46
@Walther @stefreak all working now! Just had the first build using Kaniko with no issues!
Thank you for your efforts in making this work. Happy to collaborate with this great tool in any other way you guys need.
@theoribeiro awesome 😃 Thanks for the offer, any collaboration+feedback is always appreciated! Feel free to reach out on our discord server and/or via github issues if you have any ideas https://discord.gg/gxeuDgp6Xt
I’m working on a more secure IAM setup using IRSA for in-cluster building. If you want, I can share the docs with you once it’s merged; I’d be very happy if you could try it out and give feedback. I hope to finish it within the next 7 days.