noobaa-db-pg pod, when migrated, doesn't allow new users or new buckets to be created
Environment info
- NooBaa Version: 5.9.0 (RC build of ODF 4.9.0)
- Platform: OpenShift 4.9.5 (Kubernetes v1.22.0-rc.0+a44d0f0)
```
noobaa status
INFO[0000] CLI version: 5.9.0
INFO[0000] noobaa-image: quay.io/rhceph-dev/mcg-core@sha256:6ce2ddee7aff6a0e768fce523a77c998e1e48e25d227f93843d195d65ebb81b9
INFO[0000] operator-image: quay.io/rhceph-dev/mcg-operator@sha256:cc293c7fe0fdfe3812f9d1af30b6f9c59e97d00c4727c4463a5b9d3429f4278e
INFO[0000] noobaa-db-image: registry.redhat.io/rhel8/postgresql-12@sha256:b3e5b7bc6acd6422f928242d026171bcbed40ab644a2524c84e8ccb4b1ac48ff
INFO[0000] Namespace: openshift-storage
```
```
oc version
Client Version: 4.9.5
Server Version: 4.9.5
Kubernetes Version: v1.22.0-rc.0+a44d0f0
```
Actual behavior

Note: this defect was created from the comments in https://github.com/noobaa/noobaa-core/issues/6853.
Node-down scenario: when the worker node hosting noobaa-db is shut down, the noobaa-db pod is migrated to another node; after that, it should still be possible to create new users and spawn new I/O. That does not seem to be the case.
Expected behavior

Once the noobaa-db-pg-0 pod is rescheduled and Running on another node, new accounts and new buckets can be created.
Steps to reproduce
Basic question here: when the node currently running noobaa-db-pg-0 is brought down, noobaa-db-pg-0 is moved to another worker node and reaches the Running state after roughly a 6-minute delay. However, after that we cannot create:

- new users
- new buckets (using `s3 mb`)

A sketch of these failing calls is shown after this list.
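For reference, a minimal sketch of the kinds of creation calls that hang at this stage. The exact commands used in the test are not shown in the report; the `noobaa account create` invocation and the `S3_ENDPOINT` variable below are assumptions.

```bash
# Assumed reproduction of the failing operations after the DB pod migration:

# 1) Creating a new account via the noobaa CLI hangs.
noobaa account create test-account -n openshift-storage

# 2) Creating a new bucket via "s3 mb" against the noobaa S3 endpoint also hangs.
#    S3_ENDPOINT is a placeholder for this cluster's noobaa S3 route.
aws --endpoint-url "$S3_ENDPOINT" s3 mb s3://test-bucket
```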
Step 1: noobaa-core-0 and noobaa-db-pg-0 are running on worker2 and worker1 respectively:

```
NAME             READY   STATUS    RESTARTS   AGE   IP              NODE                                  NOMINATED NODE   READINESS GATES
noobaa-core-0    1/1     Running   0          20h   10.254.23.179   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>           <none>
noobaa-db-pg-0   1/1     Running   0          31m   10.254.12.12    worker1.rkomandu-ta.cp.fyre.ibm.com   <none>           <none>
```
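The placement listing above is standard wide pod output; a likely invocation (the exact command is not shown in the report):

```bash
# List pods with node placement in the noobaa namespace:
oc -n openshift-storage get pods -o wide
```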
Step 2: Made worker1, where noobaa-db-pg-0 is running, go down:

```
worker0.rkomandu-ta.cp.fyre.ibm.com   Ready      worker   53d   v1.22.0-rc.0+a44d0f0
worker1.rkomandu-ta.cp.fyre.ibm.com   NotReady   worker   53d   v1.22.0-rc.0+a44d0f0
worker2.rkomandu-ta.cp.fyre.ibm.com   Ready      worker   53d   v1.22.0-rc.0+a44d0f0
```
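The node status above is the output of a standard node listing:

```bash
# Confirm worker1 is NotReady after the shutdown:
oc get nodes
```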
Step 3: noobaa-db-pg-0 moved from worker1 to worker2:

```
noobaa-db-pg-0   0/1   Init:0/2   0   15s   <none>   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>   <none>
```
Step 4: It is still trying to initialize:

```
noobaa-db-pg-0   0/1   Init:0/2   0   3m56s   <none>   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>   <none>
```
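While the pod sits in Init:0/2, the usual way to see what the init containers are blocked on (for example, a volume attach after the node failure) is:

```bash
# Show init-container status and the events explaining the delay:
oc -n openshift-storage describe pod noobaa-db-pg-0
oc -n openshift-storage get events --sort-by=.lastTimestamp | tail -20
```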
Step 5: After 6mXsec, it got into the Running state:

```
noobaa-db-pg-0   1/1   Running   0   91m   10.254.23.217   worker2.rkomandu-ta.cp.fyre.ibm.com   <none>   <none>
```
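At this point it is worth confirming that the DB service has picked up the new pod IP. A standard check, assuming the default noobaa-db-pg service name:

```bash
# The endpoints should list the rescheduled pod's IP (10.254.23.217 above):
oc -n openshift-storage get endpoints noobaa-db-pg
```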
Step 6: The noobaa API call to list_accounts just hangs:

```
noobaa api account_api list_accounts {}
INFO[0000] ✅ Exists: NooBaa "noobaa"
INFO[0000] ✅ Exists: Service "noobaa-mgmt"
INFO[0000] ✅ Exists: Secret "noobaa-operator"
INFO[0000] ✅ Exists: Secret "noobaa-admin"
INFO[0000] ✈️  RPC: account.list_accounts() Request: map[]
WARN[0000] RPC: GetConnection creating connection to wss://localhost:42325/rpc/ 0xc000a996d0
INFO[0000] RPC: Connecting websocket (0xc000a996d0) &{RPC:0xc0004bd130 Address:wss://localhost:42325/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s}
INFO[0000] RPC: Connected websocket (0xc000a996d0) &{RPC:0xc0004bd130 Address:wss://localhost:42325/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:0s}
```
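To turn the hang into a pass/fail check, the same call can be wrapped in a timeout (a sketch using coreutils `timeout`; not part of the original report):

```bash
# timeout exits with code 124 if the RPC does not answer within 60 seconds:
if timeout 60 noobaa api account_api list_accounts {}; then
  echo "RPC responded"
else
  echo "RPC hung or failed"
fi
```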
This is a bigger problem for any failover testing: during that window, no new users and no new buckets can be created.
Attaching must-gather logs
must-gather.local-noobaa-db-pg-0.tar.gz
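Bundles like the one attached are typically collected with `oc adm must-gather`; the exact image used here is not stated, so the image below is a placeholder:

```bash
# <odf-must-gather-image> is a placeholder for the ODF must-gather image matching this cluster:
oc adm must-gather --image=<odf-must-gather-image> --dest-dir=./must-gather.local
```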
Archive.zip
@rkomandu, could you try the following procedure once noobaa-db comes into a working state but the noobaa RPC API calls still do not respond?

Please see the attached archive; it includes a simple RPC test that calls account.list_accounts() using the internal cluster address of noobaa-core.
The archive includes:

Sample run:
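A quick way to find the internal cluster address such a test would target (standard oc queries, assuming the openshift-storage namespace used in this report):

```bash
# Pod IP of noobaa-core-0, the internal address the RPC test calls:
oc -n openshift-storage get pod noobaa-core-0 -o jsonpath='{.status.podIP}{"\n"}'
# Cluster IP of the mgmt service in front of it:
oc -n openshift-storage get svc noobaa-mgmt -o jsonpath='{.spec.clusterIP}{"\n"}'
```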
There are updates from the CSI team on the known limitations and on keeping the CSI attacher ReplicaSet for now. This has been tested with CNSA 5.1.3, CSI 2.5.0 + ODF 4.9.5-4 downstream builds + the latest DAS operator.
When the node hosting the noobaa-db pod goes down, it takes approximately 6m30s-7m for the pod to be rescheduled, and in that interim no new accounts, exports, or buckets can be created. Once the noobaa-db pod is back in the Running state, business is as usual.
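A rough way to measure that window (a sketch; not from the original report):

```bash
# Timestamp each status transition of the db pod while the node is down:
oc -n openshift-storage get pod noobaa-db-pg-0 -w -o wide | while read -r line; do
  printf '%s  %s\n' "$(date +%H:%M:%S)" "$line"
done
```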
Closing this defect for now. There is an enhancement planned by the CSI team for future releases.