Mgr Timeout while deploying with ansible-playbook

Bug Report

What happened:

I am trying this for the first time, so there may be a configuration error on my part, but so far I can't find one.

TASK [ceph-mgr : wait for all mgr to be up] **********************************************************************************************************************************************************
Friday 24 April 2020  17:41:38 +0530 (0:00:00.037)       0:06:02.798 ********** 
FAILED - RETRYING: wait for all mgr to be up (30 retries left).
FAILED - RETRYING: wait for all mgr to be up (29 retries left).
FAILED - RETRYING: wait for all mgr to be up (28 retries left).
FAILED - RETRYING: wait for all mgr to be up (27 retries left).
FAILED - RETRYING: wait for all mgr to be up (26 retries left).
FAILED - RETRYING: wait for all mgr to be up (25 retries left).
FAILED - RETRYING: wait for all mgr to be up (24 retries left).
FAILED - RETRYING: wait for all mgr to be up (23 retries left).
FAILED - RETRYING: wait for all mgr to be up (22 retries left).
FAILED - RETRYING: wait for all mgr to be up (21 retries left).
FAILED - RETRYING: wait for all mgr to be up (20 retries left).
FAILED - RETRYING: wait for all mgr to be up (19 retries left).
FAILED - RETRYING: wait for all mgr to be up (18 retries left).
FAILED - RETRYING: wait for all mgr to be up (17 retries left).
FAILED - RETRYING: wait for all mgr to be up (16 retries left).
FAILED - RETRYING: wait for all mgr to be up (15 retries left).
FAILED - RETRYING: wait for all mgr to be up (14 retries left).
FAILED - RETRYING: wait for all mgr to be up (13 retries left).
FAILED - RETRYING: wait for all mgr to be up (12 retries left).
FAILED - RETRYING: wait for all mgr to be up (11 retries left).
FAILED - RETRYING: wait for all mgr to be up (10 retries left).
FAILED - RETRYING: wait for all mgr to be up (9 retries left).
FAILED - RETRYING: wait for all mgr to be up (8 retries left).
FAILED - RETRYING: wait for all mgr to be up (7 retries left).
FAILED - RETRYING: wait for all mgr to be up (6 retries left).
FAILED - RETRYING: wait for all mgr to be up (5 retries left).
FAILED - RETRYING: wait for all mgr to be up (4 retries left).
FAILED - RETRYING: wait for all mgr to be up (3 retries left).
FAILED - RETRYING: wait for all mgr to be up (2 retries left).
FAILED - RETRYING: wait for all mgr to be up (1 retries left).
fatal: [root@10.70.59.138 -> root@10.70.59.138]: FAILED! => changed=false 
  attempts: 30
  cmd:
  - ceph
  - --cluster
  - ceph
  - mgr
  - dump
  - -f
  - json
  delta: '0:00:00.229637'
  end: '2020-04-24 17:45:18.220025'
  rc: 0
  start: '2020-04-24 17:45:17.990388'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |2-
  
    {"epoch":1,"active_gid":0,"active_name":"","active_addrs":{"addrvec":[]},"active_addr":":/0","active_change":"0.000000","available":false,"standbys":[],"modules":["iostat","restful"],"available_modules":[],"services":{},"always_on_modules":{"nautilus":["balancer","crash","devicehealth","orchestrator_cli","progress","rbd_support","status","volumes"]}}
  stdout_lines: <omitted>

NO MORE HOSTS LEFT ***********************************************************************************************************************************************************************************

PLAY RECAP *******************************************************************************************************************************************************************************************
root@10.70.59.138          : ok=178  changed=14   unreachable=0    failed=1    skipped=289  rescued=0    ignored=0   
root@10.70.59.139          : ok=96   changed=7    unreachable=0    failed=0    skipped=212  rescued=0    ignored=0   
root@10.70.59.140          : ok=96   changed=7    unreachable=0    failed=0    skipped=212  rescued=0    ignored=0   


INSTALLER STATUS *************************************************************************************************************************************************************************************
Install Ceph Monitor           : Complete (0:02:25)
Install Ceph Manager           : In Progress (0:04:40)
	This phase can be restarted by running: roles/ceph-mgr/tasks/main.yml
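
The `ceph mgr dump -f json` output in the failed task reports `"available":false` with an empty `active_name`, which suggests that no manager daemon had registered as active by the time the retries ran out. The sketch below is only an illustration of how such a dump can be summarized; it is not the check ceph-ansible itself performs (that condition is not shown in this issue). The helper name `mgr_status` is hypothetical, and the assumption that each standby entry carries a `name` field is mine.

```python
import json


def mgr_status(dump_json: str) -> dict:
    """Summarize the fields of a `ceph mgr dump -f json` document
    that indicate whether a manager is up (hypothetical helper)."""
    dump = json.loads(dump_json)
    standbys = []
    for entry in dump.get("standbys", []):
        # Assumption: standby entries are objects with a "name" key.
        standbys.append(entry.get("name", "") if isinstance(entry, dict) else str(entry))
    return {
        "available": bool(dump.get("available", False)),
        "active_name": dump.get("active_name", ""),
        "standbys": standbys,
    }


# Trimmed copy of the stdout shown in the failed task above.
sample = (
    '{"epoch":1,"active_gid":0,"active_name":"",'
    '"available":false,"standbys":[]}'
)
status = mgr_status(sample)
print(status)
```

On a monitor node the input could come from running `ceph --cluster ceph mgr dump -f json`, the same command listed under `cmd` in the failure output.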

What you expected to happen: The deployment to complete without errors.

How to reproduce it (minimal and precise):

Share your group_vars files, inventory and full ceph-ansible log

Environment:

  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Docker version if applicable (e.g. docker version):
  • Ansible version (e.g. ansible-playbook --version):
  • ceph-ansible version (e.g. git head or tag or stable branch):
  • Ceph version (e.g. ceph -v):

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (4 by maintainers)

Top GitHub Comments

nizamial09-zz commented, May 23, 2020 (1 reaction)

Not an issue now. It had something to do with the nodes I was using. Thanks @dsavineau

dsavineau commented, Apr 24, 2020 (1 reaction)

@nizamial09 the default ceph-ansible ansible.cfg sets the Ansible log file to $HOME/ansible/ansible.log [1]. If that directory doesn’t exist, no log file will be written.

[1] https://github.com/ceph/ceph-ansible/blob/master/ansible.cfg#L12
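
As a minimal sketch of the point above, the following Python reads the `log_path` setting from the `[defaults]` section of an `ansible.cfg`, expands `$HOME`, and creates the parent directory if it is missing. The function name `ensure_log_dir` is hypothetical, and the sample config text is an assumption that contains only the one setting described in the comment, not the full ceph-ansible file.

```python
import configparser
import os
import tempfile
from pathlib import Path
from string import Template


def ensure_log_dir(cfg_text: str, env=os.environ) -> Path:
    """Resolve [defaults] log_path and make sure its directory exists
    (hypothetical helper, not part of Ansible or ceph-ansible)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    raw = parser.get("defaults", "log_path")
    # Expand $HOME (or ${HOME}) from the given environment mapping.
    log_path = Path(Template(raw).safe_substitute(env))
    log_path.parent.mkdir(parents=True, exist_ok=True)
    return log_path


# Assumed minimal config, matching the path named in the comment above.
sample_cfg = "[defaults]\nlog_path = $HOME/ansible/ansible.log\n"

# Demo against a throwaway directory instead of the real home directory.
with tempfile.TemporaryDirectory() as fake_home:
    resolved = ensure_log_dir(sample_cfg, {"HOME": fake_home})
    print(resolved.parent.is_dir())
```

In practice, creating the directory by hand with `mkdir -p $HOME/ansible` before running the playbook would have the same effect.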
