noobaa-db-pg pod, when migrated, doesn't allow new users or new buckets to be created
Environment info
- NooBaa Version: 5.9.0 (RC build of ODF 4.9.0)
- Platform: OpenShift 4.9.5 (Kubernetes v1.22.0-rc.0+a44d0f0)
```
noobaa status
INFO[0000] CLI version: 5.9.0
INFO[0000] noobaa-image: quay.io/rhceph-dev/mcg-core@sha256:6ce2ddee7aff6a0e768fce523a77c998e1e48e25d227f93843d195d65ebb81b9
INFO[0000] operator-image: quay.io/rhceph-dev/mcg-operator@sha256:cc293c7fe0fdfe3812f9d1af30b6f9c59e97d00c4727c4463a5b9d3429f4278e
INFO[0000] noobaa-db-image: registry.redhat.io/rhel8/postgresql-12@sha256:b3e5b7bc6acd6422f928242d026171bcbed40ab644a2524c84e8ccb4b1ac48ff
INFO[0000] Namespace: openshift-storage
```
```
oc version
Client Version: 4.9.5
Server Version: 4.9.5
Kubernetes Version: v1.22.0-rc.0+a44d0f0
```
Actual behavior

Note: this defect was created from the comments in https://github.com/noobaa/noobaa-core/issues/6853.
Node-down scenario: when the worker node hosting noobaa-db is shut down, the noobaa-db pod is migrated to another node; after that, it should still be possible to create new users and spawn new I/O. That does not seem to be the case.
Expected behavior

Once the noobaa-db-pg-0 pod is rescheduled and Running on another node, new accounts and new buckets can be created.
Steps to reproduce
Basic question here: when the node currently running noobaa-db-pg-0 is brought down, noobaa-db-pg-0 is moved to another worker node and reaches the Running state after roughly a 6-minute delay. However, after that we cannot create:

- new users
- new buckets (using `s3 mb`)

A sketch of these failing calls is shown after this list.
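For reference, a minimal sketch of the kinds of creation calls that hang at this stage. The exact commands used in the test are not shown in the report; the `noobaa account create` invocation and the `S3_ENDPOINT` variable below are assumptions.

```bash
# Assumed reproduction of the failing operations after the DB pod migration:

# 1) Creating a new account via the noobaa CLI hangs.
noobaa account create test-account -n openshift-storage

# 2) Creating a new bucket via "s3 mb" against the noobaa S3 endpoint also hangs.
#    S3_ENDPOINT is a placeholder for this cluster's noobaa S3 route.
aws --endpoint-url "$S3_ENDPOINT" s3 mb s3://test-bucket
```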
Step 1: noobaa-core-0 and noobaa-db-pg-0 are running on worker2 and worker1 respectively:

```
NAME             READY   STATUS    RESTARTS   AGE   IP              NODE                                  NOMINATED NODE   READINESS GATES
noobaa-core-0    1/1     Running   0          20h   10.254.23.179   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>           <none>
noobaa-db-pg-0   1/1     Running   0          31m   10.254.12.12    worker1.rkomandu-ta.cp.fyre.ibm.com   <none>           <none>
```
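The placement listing above is standard wide pod output; a likely invocation (the exact command is not shown in the report):

```bash
# List pods with node placement in the noobaa namespace:
oc -n openshift-storage get pods -o wide
```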
Step 2: Made worker1, where noobaa-db-pg-0 is running, go down:

```
worker0.rkomandu-ta.cp.fyre.ibm.com   Ready      worker   53d   v1.22.0-rc.0+a44d0f0
worker1.rkomandu-ta.cp.fyre.ibm.com   NotReady   worker   53d   v1.22.0-rc.0+a44d0f0
worker2.rkomandu-ta.cp.fyre.ibm.com   Ready      worker   53d   v1.22.0-rc.0+a44d0f0
```
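The node status above is the output of a standard node listing:

```bash
# Confirm worker1 is NotReady after the shutdown:
oc get nodes
```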
Step 3: noobaa-db-pg-0 moved from worker1 to worker2:

```
noobaa-db-pg-0   0/1   Init:0/2   0   15s   <none>   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>   <none>
```
Step 4: It is still trying to initialize:

```
noobaa-db-pg-0   0/1   Init:0/2   0   3m56s   <none>   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>   <none>
```
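While the pod sits in Init:0/2, the usual way to see what the init containers are blocked on (for example, a volume attach after the node failure) is:

```bash
# Show init-container status and the events explaining the delay:
oc -n openshift-storage describe pod noobaa-db-pg-0
oc -n openshift-storage get events --sort-by=.lastTimestamp | tail -20
```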
Step 5: After 6mXsec, it got into the Running state:

```
noobaa-db-pg-0   1/1   Running   0   91m   10.254.23.217   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>   <none>
```
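At this point it is worth confirming that the DB service has picked up the new pod IP. A standard check, assuming the default noobaa-db-pg service name:

```bash
# The endpoints should list the rescheduled pod's IP (10.254.23.217 above):
oc -n openshift-storage get endpoints noobaa-db-pg
```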
Step 6: The noobaa API call to list_accounts just hangs:

```
noobaa api account_api list_accounts {}
INFO[0000] ✅ Exists: NooBaa "noobaa"
INFO[0000] ✅ Exists: Service "noobaa-mgmt"
INFO[0000] ✅ Exists: Secret "noobaa-operator"
INFO[0000] ✅ Exists: Secret "noobaa-admin"
INFO[0000] ✈️  RPC: account.list_accounts() Request: map[]
WARN[0000] RPC: GetConnection creating connection to wss://localhost:42325/rpc/ 0xc000a996d0
INFO[0000] RPC: Connecting websocket (0xc000a996d0) &{RPC:0xc0004bd130 Address:wss://localhost:42325/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s}
INFO[0000] RPC: Connected websocket (0xc000a996d0) &{RPC:0xc0004bd130 Address:wss://localhost:42325/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s}
```
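To turn the hang into a pass/fail check, the same call can be wrapped in a timeout (a sketch using coreutils `timeout`; not part of the original report):

```bash
# timeout exits with code 124 if the RPC does not answer within 60 seconds:
if timeout 60 noobaa api account_api list_accounts {}; then
  echo "RPC responded"
else
  echo "RPC hung or failed"
fi
```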
This is a bigger problem for any failover testing: during that window, no new users and no new buckets can be created.
Attaching must-gather logs
must-gather.local-noobaa-db-pg-0.tar.gz
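Bundles like the one attached are typically collected with `oc adm must-gather`; the exact image used here is not stated, so the image below is a placeholder:

```bash
# <odf-must-gather-image> is a placeholder for the ODF must-gather image matching this cluster:
oc adm must-gather --image=<odf-must-gather-image> --dest-dir=./must-gather.local
```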
Archive.zip
@rkomandu, could you try the following procedure once noobaa-db comes into a working state but the noobaa RPC API calls still do not respond?

Please see the attached archive; it includes a simple RPC test that calls account.list_accounts() using the internal cluster address of noobaa-core.
The archive includes:

Sample run:
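A quick way to find the internal cluster address such a test would target (standard oc queries, assuming the openshift-storage namespace used in this report):

```bash
# Pod IP of noobaa-core-0, the internal address the RPC test calls:
oc -n openshift-storage get pod noobaa-core-0 -o jsonpath='{.status.podIP}{"\n"}'
# Cluster IP of the mgmt service in front of it:
oc -n openshift-storage get svc noobaa-mgmt -o jsonpath='{.spec.clusterIP}{"\n"}'
```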
There are updates from the CSI team on the known limitations and on keeping the CSI attacher ReplicaSet for now. This has been tested with CNSA 5.1.3, CSI 2.5.0 + ODF 4.9.5-4 downstream builds + the latest DAS operator.
When the node hosting the noobaa-db pod goes down, it takes approximately 6m30s-7m for the pod to be rescheduled, and in that interim no new accounts, exports, or buckets can be created. Once the noobaa-db pod is back in the Running state, business is as usual.
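A rough way to measure that window (a sketch; not from the original report):

```bash
# Timestamp each status transition of the db pod while the node is down:
oc -n openshift-storage get pod noobaa-db-pg-0 -w -o wide | while read -r line; do
  printf '%s  %s\n' "$(date +%H:%M:%S)" "$line"
done
```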
Closing this defect for now. There is an enhancement planned by the CSI team for future releases.