
ceph_purge-container-cluster.yml cannot be run multiple times.

See original GitHub issue

Hello,

In my configuration I have set up Logical Volumes. When I deploy, it works fine. The problem is during the purge: I cannot run the purge playbook (ceph_purge-container-cluster.yml) more than once. I have the impression that this playbook is not idempotent. Am I the only one having this problem?

Thank you for your help. Regards.

Bug Report

TASK [zap and destroy osds created by ceph-volume with lvm_volumes] ***************************************************************************************************************
Friday 12 February 2021  14:54:08 +0100 (0:00:00.136)       0:10:26.168 *******
The full traceback is:
Traceback (most recent call last):
  File "<stdin>", line 102, in <module>
  File "<stdin>", line 94, in _ansiballz_main
  File "<stdin>", line 40, in invoke_module
  File "/usr/lib64/python3.6/runpy.py", line 205, in run_module
    return _run_module_code(code, init_globals, run_name, mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 759, in <module>
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 755, in main
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 633, in run_module
  File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 475, in is_lv
  File "/usr/lib64/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
failed: [neocore6] (item={'data': 'LV_data_OSD1', 'crush_device_class': 'hdd', 'data_vg': 'vg_data_HDD_1', 'wal': 'LV_wal_OSD1', 'wal_vg': 'vg_data_SSD_1'}) => changed=false
  ansible_loop_var: item
  item:
    crush_device_class: hdd
    data: LV_data_OSD1
    data_vg: vg_data_HDD_1
    wal: LV_wal_OSD1
    wal_vg: vg_data_SSD_1
  module_stderr: |-
    Traceback (most recent call last):
      File "<stdin>", line 102, in <module>
      File "<stdin>", line 94, in _ansiballz_main
      File "<stdin>", line 40, in invoke_module
      File "/usr/lib64/python3.6/runpy.py", line 205, in run_module
        return _run_module_code(code, init_globals, run_name, mod_spec)
      File "/usr/lib64/python3.6/runpy.py", line 96, in _run_module_code
        mod_name, mod_spec, pkg_name, script_name)
      File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 759, in <module>
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 755, in main
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 633, in run_module
      File "/tmp/ansible_ceph_volume_payload_462rwwao/ansible_ceph_volume_payload.zip/ansible/modules/ceph_volume.py", line 475, in is_lv
      File "/usr/lib64/python3.6/json/__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "/usr/lib64/python3.6/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
  module_stdout: ''
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 1
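
For context, json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) is what Python raises when json.loads() is handed empty (or otherwise non-JSON) text, so the is_lv check in ceph_volume.py apparently received no usable output from its LVM query once the Logical Volumes were gone. A minimal illustration of the error itself (not ceph-ansible code):

import json

try:
    json.loads("")   # e.g. a command that printed nothing to stdout
except json.JSONDecodeError as exc:
    # Prints: JSONDecodeError Expecting value: line 1 column 1 (char 0)
    print(type(exc).__name__, exc)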

What happened: when I re-run the playbook a second time, the task “zap and destroy osds created by ceph-volume with lvm_volumes” crashes. My Logical Volumes were deleted during the first run; on the second run I get a MODULE FAILURE error.

What you expected to happen: I expect the task to detect that my Logical Volumes are already deleted and to continue with the rest of the purge playbook.
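
In other words, an idempotent zap task would treat “the LV no longer exists” as a no-op instead of a failure. The sketch below shows the general shape of such a tolerant existence check; the function name lv_exists and the exact lvs invocation are assumptions for illustration, not the real is_lv() from ceph_volume.py:

import json
import subprocess


def lv_exists(vg, lv):
    """Return True only when lvs reports the LV; treat missing or
    unparseable output as "already removed" so a re-run can skip the zap."""
    cmd = [
        "lvs", "--noheadings", "--reportformat", "json",
        "--select", "lv_name={},vg_name={}".format(lv, vg),
    ]
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    if proc.returncode != 0 or not proc.stdout.strip():
        return False  # no output at all: nothing left to zap
    try:
        report = json.loads(proc.stdout)
    except json.JSONDecodeError:
        return False  # non-JSON output, the failure mode in the traceback above
    return len(report.get("report", [{}])[0].get("lv", [])) > 0


if __name__ == "__main__":
    # Hypothetical values taken from the lvm_volumes entry in group_vars below.
    print(lv_exists("vg_data_HDD_1", "LV_data_OSD1"))

With a guard like this in front of the zap step, a second purge run would simply report the OSD LVs as absent and move on instead of failing the play.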

How to reproduce it (minimal and precise):

With an LVM configuration (lvm_volumes) in your Ceph inventory, can you run the purge playbook several times without any problem?

Share your group_vars files, inventory and full ceph-ansible log:

@all:
  |--@ceph_cluster:
  |  |--@cephfs:
  |  |  |--neocore3
  |  |  |--neocore4
  |  |  |--neocore5
  |  |  |--neocore6
  |  |  |--neocore7
  |  |--@clients:
  |  |  |--neocore3
  |  |--@grafana-server:
  |  |  |--neocore3
  |  |--@mdss:
  |  |  |--neocore5
  |  |--@mgrs:
  |  |  |--neocore1
  |  |  |--neocore2
  |  |  |--neocore3
  |  |--@mons:
  |  |  |--neocore1
  |  |  |--neocore2
  |  |  |--neocore3
  |  |--@osds:
  |  |  |--neocore4
  |  |  |--neocore5
  |  |  |--neocore6
  |  |  |--neocore7
  |  |--@rgws:
  |  |  |--neocore4
  |  |  |--neocore5
  |  |  |--neocore6
  |  |  |--neocore7

group_vars:

lvm_volumes:
  - data: LV_data_OSD1
    crush_device_class: hdd
    data_vg: vg_data_HDD_1
    wal: LV_wal_OSD1
    wal_vg: vg_data_SSD_1

ceph_iscsi_config_dev: false
osd_objectstore: bluestore
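
For reference, each lvm_volumes entry above resolves to a data LV and, here, a WAL LV; those are the Logical Volumes the failing zap task is asked to destroy (compare the failed item in the output above). A small sketch that only prints those vg/lv pairs from this definition (PyYAML assumed available; illustration only):

import yaml  # PyYAML, assumed to be installed

GROUP_VARS = """
lvm_volumes:
  - data: LV_data_OSD1
    crush_device_class: hdd
    data_vg: vg_data_HDD_1
    wal: LV_wal_OSD1
    wal_vg: vg_data_SSD_1
"""

for item in yaml.safe_load(GROUP_VARS)["lvm_volumes"]:
    print("data: {}/{}".format(item["data_vg"], item["data"]))
    if "wal" in item:
        print("wal:  {}/{}".format(item["wal_vg"], item["wal"]))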

Environment:

  • OS (e.g. from /etc/os-release): RHEL 8.3
  • Kernel (e.g. uname -a): 4.18.0-240.10.1.el8_3.x86_64
  • Docker version if applicable (e.g. docker version): podman version 2.0.5
  • Ansible version (e.g. ansible-playbook --version): ansible 2.9.13
  • ceph-ansible version (e.g. git head or tag or stable branch): ceph-ansible-4.0.41-1.el8cp.noarch
  • Ceph version (e.g. ceph -v): ceph version 14.2.11-95.el8cp (1d6087ae858e7c8e72fe7390c3522c7e0d951240) nautilus (stable)

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
guits commented, Mar 22, 2021

Hi @hmeAtos,

I'll try to reproduce and let you know.

0 reactions
hmeNLE commented, Mar 22, 2021

@hmeAtos I could reproduce the issue and found out the cause.

Thinking about a patch now.

Many thanks @guits, I will check your patch.

Read more comments on GitHub >

Top Results From Across the Web

1656935 – ceph-ansible: purge-cluster.yml fails when initiated ...
The `ceph-ansible purge-cluster.yml` playbook no longer fails when run ... Perform purge-cluster.yml 2 times: first purge works as expected, 2nd purge fails ...

Purging the cluster — ceph-ansible documentation
yml is to purge a containerized cluster. These playbooks aren't intended to be run with the --limit option.

SES5.5 How to remove/replace an osd | Support - SUSE
"salt-run remove.osd" can be run multiple times if there is a failure. If the command is successful, the osd will NOT be listed...

CSI Common Issues - Rook Ceph Documentation
The issue typically is in the Ceph cluster or network connectivity. If the issue is in Provisioning the PVC Restarting the Provisioner pods...

Ceph common issues - Rook Docs
There are two ways to run the Ceph tools, either in the Rook toolbox or ... -n <cluster-namespace> get configmap rook-ceph-mon-endpoints -o yaml...
