Hot-swapping disks confuses ceph-ansible
Hi,
We have OSD devices specified via inventory (as `/dev/sda` through `/dev/sdbh`), raw journal devices similarly specified for the NVMe cards, and `raw_multi_journal` set.

This works fine until we hot-plug a disk (i.e. to replace a failed drive): the new drive then appears as `/dev/sdbk`, and the failed drive no longer exists in `/dev/`, so ceph-ansible then fails on that host in the 'ceph-osd : fix partitions gpt header or labels of the osd disks' task.
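The failure mode can be sketched roughly as follows (a minimal illustration only, not ceph-ansible's actual task code; `check_device` is a hypothetical helper): the playbook still iterates over the inventory's device list, but after the hot-swap one of those nodes no longer exists, so any tool that tries to open it (parted, sgdisk) fails.

```shell
#!/bin/sh
# Hedged sketch: the inventory still names the old device (e.g. /dev/sdah),
# but after a hot-swap that node is gone, so opening it fails -- this is
# what produces "Problem opening /dev/sdah for reading! Error is 2" below.
check_device() {
  dev="$1"
  if [ ! -b "$dev" ]; then
    # A missing block device node: exactly what trips the playbook.
    echo "missing: $dev"
    return 1
  fi
  echo "present: $dev"
}

check_device /dev/no-such-disk
```

A pre-flight check like this (skipping or failing fast on absent devices) would at least make the error obvious before the destructive `sgdisk --zap-all` step runs.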
Example failure output:

```
failed: [sto-2-2] (item=[{'_ansible_parsed': True, 'stderr_lines': [], '_ansible_item_result': True, u'end': u'2017-07-26 11:36:12.019353', '_ansible_no_log': False, u'stdout': u'', u'cmd': u'parted --script /dev/sdah print > /dev/null 2>&1', u'rc': 1, 'item': [{'_ansible_parsed': True, 'stderr_lines': [], '_ansible_item_result': True, u'end': u'2017-07-26 11:31:47.372346', '_ansible_no_log': False, u'stdout': u'', u'cmd': u"readlink -f /dev/sdah | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'", u'rc': 1, 'item': u'/dev/sdah', u'delta': u'0:00:00.004611', u'stderr': u'', u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"readlink -f /dev/sdah | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'", u'removes': None, u'creates': None, u'chdir': None}}, 'stdout_lines': [], 'failed_when_result': False, u'start': u'2017-07-26 11:31:47.367735', 'failed': False}, u'/dev/sdah'], u'delta': u'0:00:00.004145', u'stderr': u'', u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u'parted --script /dev/sdah print > /dev/null 2>&1', u'removes': None, u'creates': None, u'chdir': None}}, 'stdout_lines': [], 'failed_when_result': False, u'start': u'2017-07-26 11:36:12.015208', 'failed': False}, u'/dev/sdah']) => {"changed": false, "cmd": "sgdisk --zap-all --clear --mbrtogpt -- /dev/sdah || sgdisk --zap-all --clear --mbrtogpt -- /dev/sdah", "delta": "0:00:00.008660", "end": "2017-07-26 13:07:28.664214", "failed": true, "item": [{"_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/sdah print > /dev/null 2>&1", "delta": "0:00:00.004145", "end": "2017-07-26 11:36:12.019353", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/sdah print > /dev/null 2>&1", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "warn": true}}, "item": [{"_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "readlink -f /dev/sdah | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'", "delta": "0:00:00.004611", "end": "2017-07-26 11:31:47.372346", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "readlink -f /dev/sdah | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "warn": true}}, "item": "/dev/sdah", "rc": 1, "start": "2017-07-26 11:31:47.367735", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/sdah"], "rc": 1, "start": "2017-07-26 11:36:12.015208", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/sdah"], "rc": 4, "start": "2017-07-26 13:07:28.655554", "stderr": "Problem opening /dev/sdah for reading! Error is 2.\nThe specified file does not exist!\nProblem opening '' for writing! Program will now terminate.\nWarning! MBR not overwritten! Error is 2!\nCaution! Secondary header was placed beyond the disk's limits! Moving the\nheader, but other problems may occur!\nUnable to open device '' for writing! Errno is 2! Aborting write!\nProblem opening /dev/sdah for reading! Error is 2.\nThe specified file does not exist!\nProblem opening '' for writing! Program will now terminate.\nWarning! MBR not overwritten! Error is 2!\nCaution! Secondary header was placed beyond the disk's limits! Moving the\nheader, but other problems may occur!\nUnable to open device '' for writing! Errno is 2! Aborting write!", "stderr_lines": ["Problem opening /dev/sdah for reading! Error is 2.", "The specified file does not exist!", "Problem opening '' for writing! Program will now terminate.", "Warning! MBR not overwritten! Error is 2!", "Caution! Secondary header was placed beyond the disk's limits! Moving the", "header, but other problems may occur!", "Unable to open device '' for writing! Errno is 2! Aborting write!", "Problem opening /dev/sdah for reading! Error is 2.", "The specified file does not exist!", "Problem opening '' for writing! Program will now terminate.", "Warning! MBR not overwritten! Error is 2!", "Caution! Secondary header was placed beyond the disk's limits! Moving the", "header, but other problems may occur!", "Unable to open device '' for writing! Errno is 2! Aborting write!"], "stdout": "Information: Creating fresh partition table; will override earlier problems!\nInformation: Creating fresh partition table; will override earlier problems!", "stdout_lines": ["Information: Creating fresh partition table; will override earlier problems!", "Information: Creating fresh partition table; will override earlier problems!"]}
```
Issue Analytics

- Created: 6 years ago
- Comments: 30 (14 by maintainers)
Top GitHub Comments

On most distributions, udev already generates persistent disk names under `/dev/disk/by-path/`.

@mcv21 Wouldn't it solve your issue if you specified your disk paths as `/dev/disk/by-path/...` instead of `/dev/sda`? If a disk fails and you replace it, udev should create a link with the exact same name, since it'll be plugged in on the same connector.

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.