
ceph_purge-container-cluster.yml cannot be run multiple times.

See original GitHub issue

Hello,

In my configuration I have set up Logical Volumes. When I deploy, it works fine. The problem is during the purge: I cannot run the purge playbook (ceph_purge-container-cluster.yml) more than once. I have the impression that this playbook is not idempotent. Am I the only one having this problem?

Thank you for your help. Regards.

Bug Report

TASK [zap and destroy osds created by ceph-volume with lvm_volumes] ***************************************************************************************************************
Friday 12 February 2021  14:54:08 +0100 (0:00:00.136)       0:10:26.168 *******
The full traceback is:
Traceback (most recent call last):
  File "<stdin>", line 102, in <module>
  File "<stdin>", line 94, in _ansiballz_main
  File "<stdin>", line 40, in invoke_module
  File "/usr/lib64/python3.6/runpy.py", line 205, in run_module
    return _run_module_code(code, init_globals, run_name, mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 759, in <module>
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 755, in main
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 633, in run_module
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 475, in is_lv
  File "/usr/lib64/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
failed: [neocore6] (item={'data': 'LV_data_OSD1', 'crush_device_class': 'hdd', 'data_vg': 'vg_data_HDD_1', 'wal': 'LV_wal_OSD1', 'wal_vg': 'vg_data_SSD_1'}) => changed=false
  ansible_loop_var: item
  item:
    crush_device_class: hdd
    data: LV_data_OSD1
    data_vg: vg_data_HDD_1
    wal: LV_wal_OSD1
    wal_vg: vg_data_SSD_1
  module_stderr: |-
    Traceback (most recent call last):
      File "<stdin>", line 102, in <module>
      File "<stdin>", line 94, in _ansiballz_main
      File "<stdin>", line 40, in invoke_module
      File "/usr/lib64/python3.6/runpy.py", line 205, in run_module
        return _run_module_code(code, init_globals, run_name, mod_spec)
      File "/usr/lib64/python3.6/runpy.py", line 96, in _run_module_code
        mod_name, mod_spec, pkg_name, script_name)
      File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 759, in <module>
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 755, in main
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 633, in run_module
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 475, in is_lv
      File "/usr/lib64/python3.6/json/__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "/usr/lib64/python3.6/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
  module_stdout: ''
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 1
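
For context, json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) is what Python raises when json.loads() is handed empty (or otherwise non-JSON) text, so the is_lv check in ceph_volume.py apparently received no usable output from its LVM query once the Logical Volumes were gone. A minimal illustration of the error itself (not ceph-ansible code):

import json

try:
    json.loads("")   # e.g. a command that printed nothing to stdout
except json.JSONDecodeError as exc:
    # Prints: JSONDecodeError Expecting value: line 1 column 1 (char 0)
    print(type(exc).__name__, exc)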

What happened: when I re-run the playbook a second time, the task “zap and destroy osds created by ceph-volume with lvm_volumes” crashes. My Logical Volumes were deleted during the first run; on the second run I get a MODULE FAILURE error.

What you expected to happen: I expect the task to detect that my Logical Volumes are already deleted and to continue with the rest of the purge playbook.
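
In other words, an idempotent zap task would treat “the LV no longer exists” as a no-op instead of a failure. The sketch below shows the general shape of such a tolerant existence check; the function name lv_exists and the exact lvs invocation are assumptions for illustration, not the real is_lv() from ceph_volume.py:

import json
import subprocess


def lv_exists(vg, lv):
    """Return True only when lvs reports the LV; treat missing or
    unparseable output as "already removed" so a re-run can skip the zap."""
    cmd = [
        "lvs", "--noheadings", "--reportformat", "json",
        "--select", "lv_name={},vg_name={}".format(lv, vg),
    ]
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    if proc.returncode != 0 or not proc.stdout.strip():
        return False  # no output at all: nothing left to zap
    try:
        report = json.loads(proc.stdout)
    except json.JSONDecodeError:
        return False  # non-JSON output, the failure mode in the traceback above
    return len(report.get("report", [{}])[0].get("lv", [])) > 0


if __name__ == "__main__":
    # Hypothetical values taken from the lvm_volumes entry in group_vars below.
    print(lv_exists("vg_data_HDD_1", "LV_data_OSD1"))

With a guard like this in front of the zap step, a second purge run would simply report the OSD LVs as absent and move on instead of failing the play.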

How to reproduce it (minimal and precise):

With an LVM configuration (lvm_volumes) in your Ceph inventory, can you run the purge playbook several times without any problem?

Share your group_vars files, inventory and full ceph-ansible log:

@all:
  |--@ceph_cluster:
  |  |--@cephfs:
  |  |  |--neocore3
  |  |  |--neocore4
  |  |  |--neocore5
  |  |  |--neocore6
  |  |  |--neocore7
  |  |--@clients:
  |  |  |--neocore3
  |  |--@grafana-server:
  |  |  |--neocore3
  |  |--@mdss:
  |  |  |--neocore5
  |  |--@mgrs:
  |  |  |--neocore1
  |  |  |--neocore2
  |  |  |--neocore3
  |  |--@mons:
  |  |  |--neocore1
  |  |  |--neocore2
  |  |  |--neocore3
  |  |--@osds:
  |  |  |--neocore4
  |  |  |--neocore5
  |  |  |--neocore6
  |  |  |--neocore7
  |  |--@rgws:
  |  |  |--neocore4
  |  |  |--neocore5
  |  |  |--neocore6
  |  |  |--neocore7

group_vars:

lvm_volumes:
  - data: LV_data_OSD1
    crush_device_class: hdd
    data_vg: vg_data_HDD_1
    wal: LV_wal_OSD1
    wal_vg: vg_data_SSD_1

ceph_iscsi_config_dev: false
osd_objectstore: bluestore
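
For reference, each lvm_volumes entry above resolves to a data LV and, here, a WAL LV; those are the Logical Volumes the failing zap task is asked to destroy (compare the failed item in the output above). A small sketch that only prints those vg/lv pairs from this definition (PyYAML assumed available; illustration only):

import yaml  # PyYAML, assumed to be installed

GROUP_VARS = """
lvm_volumes:
  - data: LV_data_OSD1
    crush_device_class: hdd
    data_vg: vg_data_HDD_1
    wal: LV_wal_OSD1
    wal_vg: vg_data_SSD_1
"""

for item in yaml.safe_load(GROUP_VARS)["lvm_volumes"]:
    print("data: {}/{}".format(item["data_vg"], item["data"]))
    if "wal" in item:
        print("wal:  {}/{}".format(item["wal_vg"], item["wal"]))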

Environment:

  • OS (e.g. from /etc/os-release): RHEL 8.3
  • Kernel (e.g. uname -a): 4.18.0-240.10.1.el8_3.x86_64
  • Docker version if applicable (e.g. docker version): podman version 2.0.5
  • Ansible version (e.g. ansible-playbook --version): ansible 2.9.13
  • ceph-ansible version (e.g. git head or tag or stable branch): ceph-ansible-4.0.41-1.el8cp.noarch
  • Ceph version (e.g. ceph -v): ceph version 14.2.11-95.el8cp (1d6087ae858e7c8e72fe7390c3522c7e0d951240) nautilus (stable)

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
guits commented, Mar 22, 2021

Hi @hmeAtos,

I'll try to reproduce and let you know.

0 reactions
hmeNLE commented, Mar 22, 2021

@hmeAtos I could reproduce the issue and found out the cause.

Thinking about a patch now.

Many thanks @guits, I will check your patch.

Read more comments on GitHub >

Top Results From Across the Web

1656935 – ceph-ansible: purge-cluster.yml fails when initiated ...
The `ceph-ansible purge-cluster.yml` playbook no longer fails when run ... Perform purge-cluster.yml 2 times: first purge works as expected, 2nd purge fails ...

Purging the cluster — ceph-ansible documentation
yml is to purge a containerized cluster. These playbooks aren't intended to be run with the --limit option.

SES5.5 How to remove/replace an osd | Support - SUSE
"salt-run remove.osd" can be run multiple times if there is a failure. If the command is successful, the osd will NOT be listed...

CSI Common Issues - Rook Ceph Documentation
The issue typically is in the Ceph cluster or network connectivity. If the issue is in Provisioning the PVC Restarting the Provisioner pods...

Ceph common issues - Rook Docs
There are two ways to run the Ceph tools, either in the Rook toolbox or ... -n <cluster-namespace> get configmap rook-ceph-mon-endpoints -o yaml...
