
Disaster Recovery in BAF


Describe the bug

As part of DR testing, I tried to recover the BAF deployments but could not recover BAF (Fabric) from a Kubernetes-level failure.

To Reproduce

Steps to reproduce the behavior:

1. Install fabric network using BAF with one orderer and two organizations (1 peer)
2. After successful deployment of BAF, change the kubernetes configuration for all the organizations in the network.yaml to another working kubernetes cluster.
3. Run the deploy-network.yaml
4. deploy-network.yaml fails with the below error

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Could not find or access './build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem'\nSearched in:\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/files/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/tasks/files/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/tasks/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/shared/configuration/../../hyperledger-fabric/configuration/files/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/shared/configuration/../../hyperledger-fabric/configuration/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}

PLAY RECAP *******************************************************************************************************************************************************************************************************************
localhost                  : ok=309  changed=99   unreachable=0    failed=1    skipped=435  rescued=0    ignored=0
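
The failure message above lists every location Ansible searched on the controller, but the escaped `\n\t` sequences make it hard to read. The following is a minimal sketch, not part of BAF, that pulls those candidate paths out of the raw message and reports which ones are absent on the machine where it runs. The function names `searched_paths` and `missing_paths` are hypothetical, and the parsing assumes the message layout shown in this log ("Searched in:" followed by one path per line).

```python
import os
import posixpath
import re
from typing import List


def searched_paths(msg: str) -> List[str]:
    """Return the paths listed after 'Searched in:' in an Ansible error message."""
    # The JSON-style log escapes newlines and tabs; turn them back into real ones.
    text = msg.replace("\\n", "\n").replace("\\t", "\t")
    _, _, tail = text.partition("Searched in:")
    paths = []
    for line in tail.splitlines():
        line = line.strip()
        # Skip blank lines and the trailing hint about remote_src.
        if not line or line.startswith("If you are using"):
            continue
        # The last path carries an " on the Ansible Controller." suffix.
        line = re.sub(r" on the Ansible Controller\.?$", "", line)
        # Collapse "./" and "../" segments for readability.
        # Note: this is purely textual and does not resolve symlinks.
        paths.append(posixpath.normpath(line))
    return paths


def missing_paths(msg: str) -> List[str]:
    """Return the searched paths that do not exist on this machine."""
    return [p for p in searched_paths(msg) if not os.path.exists(p)]
```

Passing the `msg` value from the failed task to `missing_paths` on the Ansible controller shows which of the searched locations lack the certificate file there.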

Expected behavior

The deployment of the network should go through.


Environment (please complete the following information):

OS: From docker container
Cloud environment: AKS
K8S Version: 1.17.11

Additional context

I also tried the above test on the same Kubernetes cluster the network was deployed on, by deleting the namespace of one organization (ca, ca-tools, peer, pvc, services, etc. are deleted). When I run deploy-network.yaml after deleting the namespace, I get the error below:

Getting secrets from Vault Server: http://vault-test.eastus.azurecontainer.io:8200
{ "errors": [ "permission denied" ] }
ERROR: unable to retrieve vault login token: {
  "errors": [
    "permission denied"
  ]
}
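
To narrow down a "permission denied" response like the one above, it can help to reproduce the Vault login call by hand. The issue does not say which auth method is in use; the sketch below assumes Vault's Kubernetes auth method, where a client posts a role name and a service account JWT to `/v1/auth/<mount>/login`. The mount path, role name, and token values are placeholders, and `build_login_request` and `try_login` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def build_login_request(vault_addr: str, auth_path: str, role: str, jwt: str) -> urllib.request.Request:
    """Build (but do not send) a Vault Kubernetes-auth login request.

    auth_path and role are assumptions; use the values configured for your network.
    """
    url = f"{vault_addr.rstrip('/')}/v1/auth/{auth_path.strip('/')}/login"
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def try_login(req: urllib.request.Request) -> Tuple[int, str]:
    """Send the request and return (HTTP status, response body) without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Vault reports failures such as "permission denied" in the error body.
        return err.code, err.read().decode("utf-8")
```

If the same role and a freshly issued service account token still return "permission denied", the auth configuration on the Vault side is a reasonable next place to look; that is a troubleshooting suggestion, not a confirmed diagnosis from the issue thread.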

Issue Analytics

State:
Created 3 years ago
Comments:9 (9 by maintainers)

Top GitHub Comments


sivaramsk commented, Nov 11, 2020

Awesome. On a related note, I was also testing BAF with velero backup and restore. I hit a bug with velero restore which got fixed recently - https://github.com/vmware-tanzu/velero/issues/3027. I would test velero approach as well, and will document it.


abevers commented, Nov 23, 2020

Hi @sivaramsk, my scenario is not exactly the same. The scenario I have researched is a complete shutdown of the cluster (scaling deployments to 0) and then the same cluster restarting - without re-running the network.yaml. I think your suggestion of pointing to a new cluster for 1 or more organizations is also very valid. I’ll discuss this with the team.
