
Noobaa endpoint pod Terminated before new one comes in Running state

See original GitHub issue

Environment info

  • NooBaa Version: ODF 4.9.0 - 236.ci
  • Platform: OCP 4.9

Actual behavior

  1. The old endpoint pod goes into Terminating state before the new one is Running

Expected behavior

  1. The old endpoint pod should go into Terminating only after the new one is in Running state

Steps to reproduce

  1. Change the NooBaa log level by editing the noobaa-config ConfigMap and watch the endpoint pods (console session below; a scriptable sketch of the same edit follows it)
[root@ocp-akshat-1-inf akshat]# oc get pod
NAME                                               READY   STATUS    RESTARTS   AGE
noobaa-core-0                                      1/1     Running   0          6m45s
noobaa-db-pg-0                                     1/1     Running   0          6m45s
noobaa-default-backing-store-noobaa-pod-41376b30   1/1     Running   0          4m46s
noobaa-endpoint-6b98668666-wkjvq                   1/1     Running   0          45s
noobaa-operator-67c57b5464-wqcww                   1/1     Running   0          4h30m
ocs-metrics-exporter-6fc4cbfcb6-qc7vx              1/1     Running   0          4h30m
ocs-operator-75fc898954-9cqt5                      1/1     Running   0          4h30m
odf-console-7dc4779787-qv44z                       1/1     Running   0          4h30m
odf-operator-controller-manager-548c77868b-nb87q   2/2     Running   0          4h30m
rook-ceph-operator-c796c8c98-nqd8f                 1/1     Running   0          4h30m
[root@ocp-akshat-1-inf akshat]# kubectl edit configmap/noobaa-config
configmap/noobaa-config edited
[root@ocp-akshat-1-inf akshat]# 
[root@ocp-akshat-1-inf akshat]# 
[root@ocp-akshat-1-inf akshat]# oc get pod
NAME                                               READY   STATUS              RESTARTS   AGE
noobaa-core-0                                      1/1     Terminating         0          7m12s
noobaa-db-pg-0                                     1/1     Running             0          7m12s
noobaa-default-backing-store-noobaa-pod-41376b30   1/1     Running             0          5m13s
noobaa-endpoint-6b98668666-px86w                   0/1     ContainerCreating   0          2s    ----->>> This one       
noobaa-endpoint-6b98668666-wkjvq                   1/1     Terminating         0          72s
noobaa-operator-67c57b5464-wqcww                   1/1     Running             0          4h30m
ocs-metrics-exporter-6fc4cbfcb6-qc7vx              1/1     Running             0          4h30m
ocs-operator-75fc898954-9cqt5                      1/1     Running             0          4h30m
odf-console-7dc4779787-qv44z                       1/1     Running             0          4h31m
odf-operator-controller-manager-548c77868b-nb87q   2/2     Running             0          4h31m
rook-ceph-operator-c796c8c98-nqd8f                 1/1     Running             0          4h30m
[root@ocp-akshat-1-inf akshat]# 
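
For reference, here is a minimal scriptable sketch of the same "kubectl edit configmap/noobaa-config" step, written against controller-runtime. The openshift-storage namespace, the NOOBAA_LOG_LEVEL key, and the value "all" are assumptions for illustration only, not the confirmed ConfigMap schema; check the ConfigMap on your cluster for the real keys.

// Sketch: programmatically edit the noobaa-config ConfigMap the same way
// "kubectl edit configmap/noobaa-config" does interactively.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
)

func main() {
	ctx := context.Background()

	// Build a client from the local kubeconfig (or in-cluster config).
	cfg, err := config.GetConfig()
	if err != nil {
		panic(err)
	}
	c, err := client.New(cfg, client.Options{})
	if err != nil {
		panic(err)
	}

	// Fetch the ConfigMap the endpoint pods read their env from.
	// Namespace "openshift-storage" is assumed (the usual ODF namespace).
	cm := &corev1.ConfigMap{}
	key := types.NamespacedName{Namespace: "openshift-storage", Name: "noobaa-config"}
	if err := c.Get(ctx, key, cm); err != nil {
		panic(err)
	}

	// NOOBAA_LOG_LEVEL and the value "all" are assumed here for illustration.
	if cm.Data == nil {
		cm.Data = map[string]string{}
	}
	cm.Data["NOOBAA_LOG_LEVEL"] = "all"
	if err := c.Update(ctx, cm); err != nil {
		panic(err)
	}
	fmt.Println("configmap/noobaa-config edited")
}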

More information - Screenshots / Logs / Other output

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
dannyzaken commented, Nov 23, 2021

@guymguym @akmithal So what I think happens here is that we explicitly delete the endpoint pods when the ConfigMap is changed, so the pods will pick up the new envs. Kubernetes does not automatically restart pods on ConfigMap changes, so in this case the Deployment strategy is not relevant.

see here (line 1486):

https://github.com/noobaa/noobaa-operator/blob/207b8f5dbd0d1ee73363d10541a626a6a5eeb661/pkg/system/phase4_configuring.go#L1470-L1488
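
In outline, that behaviour looks something like the sketch below, written against controller-runtime. It is not the operator's actual code; the restartEndpointPods name and the noobaa-s3=noobaa label selector are assumptions. The point is that the endpoint pods are listed and deleted directly, and nothing waits for the replacements to become Ready, which is why the old pods reach Terminating before the new ones are Running.

// Sketch of the behaviour described above, not the operator's actual code.
// On a noobaa-config change, delete the endpoint pods directly so they come
// back with the new env. A plain Delete does not wait for the replacements.
package system

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// restartEndpointPods is a hypothetical helper; the "noobaa-s3: noobaa" label
// selector is an assumption used only to pick out the endpoint pods.
func restartEndpointPods(ctx context.Context, c client.Client, namespace string) error {
	pods := &corev1.PodList{}
	if err := c.List(ctx, pods,
		client.InNamespace(namespace),
		client.MatchingLabels{"noobaa-s3": "noobaa"},
	); err != nil {
		return err
	}
	for i := range pods.Items {
		// Explicit delete: the Deployment controller recreates the pod, but
		// nothing here waits for the replacement to become Ready first.
		if err := client.IgnoreNotFound(c.Delete(ctx, &pods.Items[i])); err != nil {
			return err
		}
	}
	return nil
}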

1 reaction
akmithal commented, Nov 23, 2021

One more observation: this behaviour is observed only with the step “kubectl edit configmap/noobaa-config”.

When I edit the deployment to change CPU/memory resources, the new endpoint pod first comes into Running state and then the old one goes into Terminating (this is the expected behaviour).
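
That matches how Kubernetes Deployments behave: the rolling-update strategy only governs changes to the pod template, such as resource edits. A strategy along the lines of the sketch below (illustrative values, not NooBaa's confirmed settings) keeps the old pod Running until the new one is up when the Deployment spec changes, but it has no effect when pods are deleted directly, as in the ConfigMap case above.

// Illustrative rolling-update settings, not NooBaa's confirmed configuration.
// With maxUnavailable=0 and maxSurge=1, a change to the Deployment spec (e.g.
// CPU/memory resources) brings a new endpoint pod to Running before the old
// one is terminated. Deleting a pod directly bypasses this strategy entirely.
package system

import (
	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func endpointDeploymentStrategy() appsv1.DeploymentStrategy {
	maxUnavailable := intstr.FromInt(0) // never take the old pod down early
	maxSurge := intstr.FromInt(1)       // allow one extra pod during the rollout
	return appsv1.DeploymentStrategy{
		Type: appsv1.RollingUpdateDeploymentStrategyType,
		RollingUpdate: &appsv1.RollingUpdateDeployment{
			MaxUnavailable: &maxUnavailable,
			MaxSurge:       &maxSurge,
		},
	}
}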

Read more comments on GitHub >

Top Results From Across the Web

  • 1995271 – [GSS] noobaa-db-pg-0 Pod get in stuck ...
    CrashLoopBackOff state of noobaa-db-pg-0 pod when enabling hugepages. Previously, enabling hugepages on an OpenShift Container Platform cluster caused ...
  • noobaa-db-pg pod doesn't migrate when the node has ...
    Looks like a CSI issue, let me explain why. According to the provided error: Multi-Attach error for volume "pvc-3e03cdb0-a374-4aed-bc3f-6e6f9ba74bca" Volume is ...
  • OpenShift Disaster Recovery using Stretch Cluster
    Note that before restoring service to the failed zone or nodes, there must be confirmation that all pods with persistent volumes have terminated ...
  • RedHat: RHSA-2022-1372:01 Important: Red Hat OpenShift ...
    1991462 - helper pod runs with root privileges during Must-gather ... 2034003 - NooBaa endpoint pod Terminated before new one comes in ...
  • Installing IBM Spectrum Scale DAS
    Running the preceding step sets up the Red Hat OpenShift namespace for IBM ... Scale DAS endpoint pods for the management of IBM ...
