Noobaa endpoint pod Terminated before new one comes in Running state
Environment info
- NooBaa Version: ODF 4.9.0-236.ci
- Platform: OCP 4.9
Actual behavior
- The endpoint pods go into Terminating state before the new ones reach Running state
Expected behavior
- The old endpoint pod should go into Terminating only after the new one is in Running state
Steps to reproduce
- Changed the NooBaa log level by editing the noobaa-config config map (output below)
[root@ocp-akshat-1-inf akshat]# oc get pod
NAME READY STATUS RESTARTS AGE
noobaa-core-0 1/1 Running 0 6m45s
noobaa-db-pg-0 1/1 Running 0 6m45s
noobaa-default-backing-store-noobaa-pod-41376b30 1/1 Running 0 4m46s
noobaa-endpoint-6b98668666-wkjvq 1/1 Running 0 45s
noobaa-operator-67c57b5464-wqcww 1/1 Running 0 4h30m
ocs-metrics-exporter-6fc4cbfcb6-qc7vx 1/1 Running 0 4h30m
ocs-operator-75fc898954-9cqt5 1/1 Running 0 4h30m
odf-console-7dc4779787-qv44z 1/1 Running 0 4h30m
odf-operator-controller-manager-548c77868b-nb87q 2/2 Running 0 4h30m
rook-ceph-operator-c796c8c98-nqd8f 1/1 Running 0 4h30m
[root@ocp-akshat-1-inf akshat]# kubectl edit configmap/noobaa-config
configmap/noobaa-config edited
[root@ocp-akshat-1-inf akshat]#
[root@ocp-akshat-1-inf akshat]#
[root@ocp-akshat-1-inf akshat]# oc get pod
NAME READY STATUS RESTARTS AGE
noobaa-core-0 1/1 Terminating 0 7m12s
noobaa-db-pg-0 1/1 Running 0 7m12s
noobaa-default-backing-store-noobaa-pod-41376b30 1/1 Running 0 5m13s
noobaa-endpoint-6b98668666-px86w 0/1 ContainerCreating 0 2s ----->>> This one
noobaa-endpoint-6b98668666-wkjvq 1/1 Terminating 0 72s
noobaa-operator-67c57b5464-wqcww 1/1 Running 0 4h30m
ocs-metrics-exporter-6fc4cbfcb6-qc7vx 1/1 Running 0 4h30m
ocs-operator-75fc898954-9cqt5 1/1 Running 0 4h30m
odf-console-7dc4779787-qv44z 1/1 Running 0 4h31m
odf-operator-controller-manager-548c77868b-nb87q 2/2 Running 0 4h31m
rook-ceph-operator-c796c8c98-nqd8f 1/1 Running 0 4h30m
[root@ocp-akshat-1-inf akshat]#

@guymguym @akmithal So what I think happens here is that we explicitly delete the endpoint pods when the config map is changed, so that the pods pick up the new envs. Kubernetes does not automatically restart pods on config map changes, so in this case the deployment strategy is not relevant.
See here (line 1486):
https://github.com/noobaa/noobaa-operator/blob/207b8f5dbd0d1ee73363d10541a626a6a5eeb661/pkg/system/phase4_configuring.go#L1470-L1488
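For illustration, here is a minimal controller-runtime sketch of that pattern; the function name, label selector and error handling are assumptions for the sketch, not the actual noobaa-operator code:

```go
// Minimal sketch only; names, the label selector and error handling are
// assumptions, not the actual noobaa-operator implementation.
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// restartEndpointPods lists the endpoint pods and deletes them directly.
// The ReplicaSet recreates them, but nothing waits for the new pod to be
// Running before the old one starts Terminating.
func restartEndpointPods(ctx context.Context, c client.Client, namespace string) error {
	var pods corev1.PodList
	if err := c.List(ctx, &pods,
		client.InNamespace(namespace),
		client.MatchingLabels{"noobaa-s3": "noobaa"}, // assumed endpoint pod label
	); err != nil {
		return err
	}
	for i := range pods.Items {
		if err := c.Delete(ctx, &pods.Items[i]); client.IgnoreNotFound(err) != nil {
			return err
		}
	}
	return nil
}
```

Because the pods are deleted directly, the Deployment's rolling-update settings never come into play, which matches the behaviour seen above.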
One more observation: this behaviour is observed only with the step “kubectl edit configmap/noobaa-config”.
When I edit the deployment to change CPU/memory resources, the new endpoint pod first comes into Running state and then the old one goes into Terminating (this is the expected behaviour); see the sketch below for why the two paths differ.
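Editing the Deployment changes its pod template, so Kubernetes performs a rolling update (new pod Running before the old one terminates, subject to maxSurge/maxUnavailable), whereas deleting pods bypasses that. A hypothetical sketch of a pattern that would give the same rolling behaviour on config map changes is below: stamp a hash of the config map onto the noobaa-endpoint Deployment's pod template instead of deleting pods. The function name, annotation key and error handling are assumptions, not the actual operator code:

```go
// Hypothetical alternative sketch: trigger a rolling update on config changes
// by annotating the Deployment pod template with a hash of the config map.
// The annotation key and deployment lookup are assumptions for illustration.
package sketch

import (
	"context"
	"crypto/sha256"
	"encoding/hex"
	"sort"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func rollEndpointsOnConfigChange(ctx context.Context, c client.Client, ns string, cm *corev1.ConfigMap) error {
	// Hash the config map data deterministically so the annotation only
	// changes when the configuration actually changes.
	keys := make([]string, 0, len(cm.Data))
	for k := range cm.Data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte(cm.Data[k]))
	}

	var dep appsv1.Deployment
	if err := c.Get(ctx, types.NamespacedName{Namespace: ns, Name: "noobaa-endpoint"}, &dep); err != nil {
		return err
	}
	if dep.Spec.Template.Annotations == nil {
		dep.Spec.Template.Annotations = map[string]string{}
	}
	dep.Spec.Template.Annotations["noobaa.io/configmap-hash"] = hex.EncodeToString(h.Sum(nil))
	// Updating the pod template triggers a normal rolling update of the endpoint pods.
	return c.Update(ctx, &dep)
}
```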