"Attempted to roll forward to new ReplicaSet, but minimum number of Pods did not become live"
When deploying:
```
kubernetes:apps/v1:Deployment (web):
  error: 2 errors occurred:
      * the Kubernetes API server reported that "my-application/web-v845050y" failed to fully initialize or become live: 'web-v845050y' timed out waiting to be Ready
      * Attempted to roll forward to new ReplicaSet, but minimum number of Pods did not become live
```
Expected behavior
A successful update to the Deployment object(s).
Current behavior
Pulumi times out waiting for the ReplicaSet/Deployment to stabilize, even though the Pods in question become Ready within 2 minutes.
If I watch the cluster I can see all of the Pods (2) in the ReplicaSet stand up and become Ready, and the ReplicaSet/Deployment reports them all as Ready/Up-to-Date inside of 2 minutes. This happens for every Deployment I have configured on this cluster.
I tried the latest @pulumi/kubernetes module and am on the latest Pulumi CLI binary, on EKS 1.19. I also tried tearing everything down and redeploying. There is nothing of note when describing the Deployment or ReplicaSet. It is as if the Pulumi client is simply ignoring the state of the ReplicaSet.
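While debugging this, one knob worth knowing about: the provider's await behavior can be tuned per resource with the `pulumi.com/timeoutSeconds` and `pulumi.com/skipAwait` annotations. A minimal sketch follows; the resource name, labels, and image are placeholders, and a longer timeout is only a stopgap for diagnosis, not a fix for the underlying await behavior:

```typescript
import * as k8s from '@pulumi/kubernetes'

// Sketch: extend (or skip) the readiness await on a single Deployment while debugging.
// 'web', the labels, and the image are placeholders; 600s is an arbitrary example timeout.
new k8s.apps.v1.Deployment('web', {
  metadata: {
    annotations: {
      'pulumi.com/timeoutSeconds': '600' // wait up to 10 minutes instead of the default
      // 'pulumi.com/skipAwait': 'true'  // or bypass the await logic entirely
    }
  },
  spec: {
    replicas: 2,
    selector: { matchLabels: { app: 'web' } },
    template: {
      metadata: { labels: { app: 'web' } },
      spec: { containers: [{ name: 'web', image: 'nginx:1.21' }] }
    }
  }
})
```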
Steps to reproduce
```typescript
// `kube` re-exports the Kubernetes SDK types used below; `manifest`, `utils`,
// `types`, and `aws` are project-local helper modules whose imports are omitted here.
import * as pulumi from '@pulumi/pulumi'

export class Microservice extends kube.apps.v1.Deployment {
  readonly service: kube.core.v1.Service | undefined
  readonly ports: kube.types.input.core.v1.ContainerPort[]

  constructor(name: string, args: types.DefaultDeploymentArgs, opts: types.ResourceOpts) {
    const secretVolumes = args.secretOpts
      ? manifest.makeSecretVolume(args.secretOpts.secretVolumeName, args.secretOpts.secretToMountName)
      : []
    const secretVolumeMounts = args.secretOpts
      ? manifest.makeSecretVolumeMount(args.secretOpts.secretVolumeName, args.secretOpts.secretMountPath)
      : []
    const containers = args.containers.map<types.Container>(container => ({
      ...container,
      volumeMounts: [...(container.volumeMounts ?? args.volumeMounts ?? []), ...secretVolumeMounts]
    }))

    function createContainer(container: types.Container) {
      return {
        name: container.name,
        image: container.image,
        imagePullPolicy: 'Always',
        env: container.env,
        command: container.command,
        args: container.args,
        resources: manifest.defaultResources(container.resources),
        ports: manifest.defaultContainerPorts(container.ports),
        volumeMounts: container.volumeMounts,
        livenessProbe: manifest.defaultLivenessProbe(container.livenessProbe),
        readinessProbe: manifest.defaultReadinessProbe(container.readinessProbe),
        workingDir: container.workingDir,
        lifecycle: container.lifecycle
      }
    }

    super(
      name,
      {
        metadata: {
          labels: args.labels ?? manifest.defaultLabel(name),
          annotations: args.deploymentAnnotations,
          namespace: args.namespace
        },
        spec: {
          selector: { matchLabels: args.labels ?? manifest.defaultLabel(name) },
          replicas: args.replicas,
          strategy: args.deploymentStrategy,
          template: {
            metadata: {
              labels: args.labels ?? manifest.defaultLabel(name),
              annotations: {
                ...args.podAnnotations,
                ...utils.buildLoggingAnnotations(
                  args.enableDatadogLogs ?? false,
                  containers,
                  name,
                  args.datadogLogTags ?? []
                )
              }
            },
            spec: {
              affinity: manifest.getPodAntiAZAffinity({
                key: 'app',
                operator: 'In',
                values: [name]
              }),
              containers: containers.map(createContainer),
              initContainers: (args.initContainers ?? []).map(createContainer),
              volumes: [...(args.volumes ?? []), ...secretVolumes]
            }
          }
        }
      },
      opts
    )

    this.ports = manifest.getContainerPorts(containers)
    this.service =
      this.ports.length > 0
        ? new kube.core.v1.Service(
            name,
            {
              metadata: {
                labels: args.labels ?? manifest.defaultLabel(name),
                name: name,
                namespace: args.namespace
              },
              spec: {
                ports: manifest.defaultServicePorts(this.ports),
                selector: args.labels ?? manifest.defaultLabel(name),
                type: 'ClusterIP'
              }
            },
            { parent: this }
          )
        : undefined

    if (args.scanImages) {
      containers.forEach(async container => {
        // Image refs look like "<registry>/<name>:<tag>"; tag the image for production.
        pulumi.output(container.image).apply(async v => {
          const [name, tag] = v.split('/')[1].split(':')
          await aws.ecr.tagProdImage(name, tag)
        })
      })
    }
  }
}
```
```typescript
import * as kube from './module'

new kube.Microservice(
  process.name,
  {
    containers: [
      {
        name: process.name,
        image: ecrEndpoint,
        livenessProbe: process.livenessProbe,
        readinessProbe: process.readinessProbe,
        ports: process.ports,
        command: ['sh', '-c', `${process.command}`],
        resources: process.resources,
        env: [{ name: 'ENVIRONMENT', value: env }]
      }
    ],
    replicas: process.replicas,
    enableDatadogLogs: true,
    datadogLogTags: [process.name],
    namespace: namespace,
    deploymentStrategy: {
      rollingUpdate: { maxSurge: '100%', maxUnavailable: '50%' }
    }
  },
  { provider: cluster.provider }
)
```
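Worth noting about the repro: the rolling-update strategy uses percentage values with only 2 replicas. As far as I understand the Kubernetes rollout math, maxUnavailable is rounded down and maxSurge is rounded up, so this strategy lets the full replacement set surge in while requiring at least 1 of the 2 Pods to stay available, which is the floor the provider's await logic should be checking against. A small sketch of that arithmetic (my own illustration, not provider code):

```typescript
// Hypothetical helper illustrating how the percentage-valued strategy fields resolve
// for this Deployment (replicas: 2, maxSurge: '100%', maxUnavailable: '50%').
function resolveRollingUpdate(replicas: number, maxSurge: string, maxUnavailable: string) {
  const pct = (s: string) => parseInt(s, 10) / 100
  const surge = Math.ceil(replicas * pct(maxSurge)) // percentages round up for surge
  const unavailable = Math.floor(replicas * pct(maxUnavailable)) // and round down for unavailable
  return {
    maxPods: replicas + surge, // 4: old and new ReplicaSets may briefly coexist in full
    minAvailable: replicas - unavailable // 1: the floor the await logic should accept
  }
}

console.log(resolveRollingUpdate(2, '100%', '50%')) // { maxPods: 4, minAvailable: 1 }
```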
Context (Environment)
We’re trying to migrate this application to EKS from ECS. This is happening in our Staging environment. We can’t promote this app’s EKS manifestation to Production because we can’t reliably deploy it with the current CI flow.
@viveklak Aha! Yes, you are correct: it was an update of an existing deployment and was still using the older provider version. Subsequent deploys are now using 3.5.1 and are all succeeding, so this looks fixed. Thanks!

Fixed by #1596
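For anyone landing here later: since the fix is in the provider, the thing to check is that both the @pulumi/kubernetes npm package and the resource plugin used for existing resources are on a recent version. Below is a sketch of one way to pin the plugin explicitly, assuming the `version` resource option behaves as I understand it (double-check against current Pulumi docs); `clusterKubeconfig` is a placeholder for however your stack obtains the EKS kubeconfig:

```typescript
import * as k8s from '@pulumi/kubernetes'

// Placeholder for however the stack obtains the EKS kubeconfig.
declare const clusterKubeconfig: string

// Assumption: passing `version` as a resource option pins the kubernetes
// resource plugin used by this provider (and by resources created with it).
const provider = new k8s.Provider(
  'eks',
  { kubeconfig: clusterKubeconfig },
  { version: '3.5.1' }
)
```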