db pod does not reschedule on non-tainted node

Environment info

NooBaa Version: master-20210802
Platform: OCP 4.7..4

Actual behavior

The DB pod is not scheduled on a non-tainted node; it stays in the Terminating state on the tainted node.

Expected behavior

The DB pod should be scheduled on a non-tainted node.

Steps to reproduce

Created PVC gpfs-vol-pvc-31
Created namespacestore using the command:
```
noobaa namespacestore create nsfs fs2 --pvc-name='gpfs-vol-pvc-31' --fs-backend='GPFS'
```
Currently, the pods are scheduled as below:

```
[root@api.osculate.cp.fyre.ibm.com ~]# oc get pod -o wide
NAME                                               READY   STATUS        RESTARTS   AGE   IP              NODE                               NOMINATED NODE   READINESS GATES
noobaa-core-0                                      1/1     Running       0          25m   10.254.17.153   worker1.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-db-pg-0                                     1/1     Running       0          45m   10.254.17.123   worker1.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-default-backing-store-noobaa-pod-cf4b02ee   0/1     Terminating   0          8s    <none>          worker2.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-endpoint-b67f8c458-wdgbw                    1/1     Running       0          25m   10.254.17.157   worker1.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-operator-7bb746749d-bd4sz                   1/1     Running       1          25m   10.254.17.145   worker1.osculate.cp.fyre.ibm.com   <none>           <none>
```

Taint Node 1 using:

kubectl taint nodes worker1.osculate.cp.fyre.ibm.com key1=value1:NoExecute

The DB pod now enters the Terminating state on Node 1 and is not rescheduled onto another node:

NAME                               READY   STATUS              RESTARTS   AGE   IP              NODE                               NOMINATED NODE   READINESS GATES
noobaa-core-0                      1/1     Running             0          44s   10.254.21.162   worker2.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-db-pg-0                     0/1     Terminating         0          48m   10.254.17.123   worker1.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-endpoint-b67f8c458-gw7qm    0/1     ContainerCreating   0          83s   <none>          worker2.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-endpoint-b67f8c458-wdgbw    0/1     Terminating         0          28m   10.254.17.157   worker1.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-operator-7bb746749d-2jj88   1/1     Running             0          82s   10.254.21.150   worker2.osculate.cp.fyre.ibm.com   <none>           <none>
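
When checking for this behaviour in a busier namespace, it may help to reduce the listing to the pods that are stuck in Terminating, along with the node they are on. The following is a minimal sketch, not part of the original report. It assumes the default `oc get pod -o wide` column order shown above (STATUS in column 3, NODE in column 7) and a RESTARTS value that is a single number; the function name `stuck_pods` is hypothetical.

```shell
# Hypothetical helper: reads `oc get pod -o wide` output on stdin and prints
# "<pod> <node>" for every row whose STATUS column is exactly "Terminating".
# Assumes the column order: NAME READY STATUS RESTARTS AGE IP NODE ...
stuck_pods() {
  awk '$3 == "Terminating" { print $1, $7 }'
}

# Usage against a live cluster (not executed here):
#   oc get pod -o wide | stuck_pods
```

For the listing above, this would report `noobaa-db-pg-0` and `noobaa-endpoint-b67f8c458-wdgbw`, both on worker1.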

Note: As soon as we untaint Node 1, the DB pod returns to the Running state, but still on Node 1:

node/worker1.osculate.cp.fyre.ibm.com untainted
[root@api.osculate.cp.fyre.ibm.com ~]# oc get pod -o wide
NAME                                               READY   STATUS        RESTARTS   AGE     IP              NODE                               NOMINATED NODE   READINESS GATES
noobaa-core-0                                      1/1     Running       0          5m2s    10.254.21.162   worker2.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-db-pg-0                                     1/1     Running       0          52s     10.254.17.160   worker1.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-default-backing-store-noobaa-pod-cf4b02ee   0/1     Terminating   0          1s      <none>          worker1.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-endpoint-b67f8c458-gw7qm                    1/1     Running       0          5m41s   10.254.21.164   worker2.osculate.cp.fyre.ibm.com   <none>           <none>
noobaa-operator-7bb746749d-2jj88                   1/1     Running       0          5m40s   10.254.21.150   worker2.osculate.cp.fyre.ibm.com   <none>           <none>
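
The report does not show the exact command used to remove the taint. As a sketch only, assuming the same `key1=value1:NoExecute` taint from the reproduction steps, the add and remove operations can be paired as below; a trailing `-` on the taint specification is how `kubectl taint` removes a taint. The `KUBECTL` variable and the function names are hypothetical conveniences, not something from the issue.

```shell
# Assumption: kubectl (or oc, by setting KUBECTL=oc) is on PATH and logged in.
KUBECTL="${KUBECTL:-kubectl}"

# Apply the NoExecute taint used in the reproduction steps to node "$1".
add_taint() {
  "$KUBECTL" taint nodes "$1" key1=value1:NoExecute
}

# Remove the same taint from node "$1"; note the trailing "-".
remove_taint() {
  "$KUBECTL" taint nodes "$1" key1=value1:NoExecute-
}

# Usage against a live cluster (not executed here):
#   add_taint worker1.osculate.cp.fyre.ibm.com
#   remove_taint worker1.osculate.cp.fyre.ibm.com
```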


Top GitHub Comments

ketankhurana64 commented, Aug 17, 2021

@nimrod-becker can you please add the nsfs tag to this issue? I couldn’t add it while raising the defect.

ketankhurana64 commented, Aug 6, 2021

I have the corresponding operator code installed:

INFO[0001] noobaa-image: noobaa/noobaa-core:master-20210802
INFO[0001] operator-image: noobaa/noobaa-operator:master-20210802
INFO[0001] noobaa-db-image: centos/postgresql-12-centos7