
'osd_memory_target' in section 'osd' redefined error during FSID fetch


Bug Report

What happened: When running a rolling update, the ceph-facts : get current fsid task fails because the executed command returns a non-zero return code, with stderr containing a warning that the osd_memory_target setting has been redefined.

stderr: "Can't get admin socket path: unable to get conf option admin_socket for mon.host: warning: line 17: 'osd_memory_target' in section 'osd' redefined"

I suspect this may be due to the bluestore_cache_autotune setting in Ceph. ceph-ansible sets the osd memory target setting (with spaces rather than underscores) here. However, Ceph itself appears to set the osd_memory_target setting instead if bluestore_cache_autotune is enabled, which is the default (ref). As a result, the same variable is defined twice: once with spaces, once with underscores. This does not appear to affect regular cluster functionality; however, it does cause some commands (such as the one to retrieve the fsid) to output a warning to stderr and return code 22.
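The warning itself can be reproduced with ceph-conf alone, outside of ceph-ansible. A minimal sketch, using a throwaway config file and an arbitrary osd name; Ceph's parser treats spaces, dashes, and underscores in option names as equivalent, so both spellings collide (the printed value assumes the later definition takes precedence):

# printf '[osd]\nosd memory target = 4294967296\nosd_memory_target = 4242538496\n' > /tmp/dup.conf
# ceph-conf -c /tmp/dup.conf --name osd.0 --show-config-value osd_memory_target
warning: line 3: 'osd_memory_target' in section 'osd' redefined
4242538496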

What you expected to happen: The rolling update playbook to function as normal, including correct retrieval of the FSID.

How to reproduce it (minimal and precise): Create a ceph.conf file containing a setting that is defined twice, once using underscores and once using spaces, for example:

[osd]
bluestore compression algorithm = lz4
bluestore compression mode = aggressive
osd max backfills = 8
osd memory target = 4294967296
osd recovery max active = 8

osd_memory_target = 4242538496
osd_memory_base = 2147483648
osd_memory_cache_min = 3195011072

Run the rolling update playbook. The ceph-facts : get current fsid task will fail, with the task’s stderr containing the message: Can't get admin socket path: unable to get conf option admin_socket for mon.host: warning: line 17: 'osd_memory_target' in section 'osd' redefined.

Remove one of the identical configuration entries and re-run the playbook. The task will no longer fail; however, the value will be silently added back.
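Duplicates like this can also be spotted mechanically by normalizing option names the way Ceph's parser does, where spaces and dashes are equivalent to underscores. A rough one-liner sketch (not a full INI parser, and the config path shown is just the default); against the sample config above it flags the colliding pair:

# awk -F '=' '/^\[/ { s = $0; next } NF > 1 { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k); gsub(/[ -]/, "_", k); if (seen[s, k]++) print "duplicate " k " in " s }' /etc/ceph/ceph.conf
duplicate osd_memory_target in [osd]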

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 18.04.3
  • Kernel (e.g. uname -a): 5.0.0-29-generic #31~18.04.1-Ubuntu SMP Thu Sep 12 18:29:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Docker version if applicable (e.g. docker version): 18.09.7
  • Ansible version (e.g. ansible-playbook --version): 2.8.5
  • ceph-ansible version (e.g. git head or tag or stable branch): v4.0.0rc16
  • Ceph version (e.g. ceph -v): 14.2.2/14.2.3

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

2 reactions
dsavineau commented, Oct 1, 2019

So I reproduced the issue and probably found a workaround.

# ceph daemon mon.cephaio-1 config get fsid
Can't get admin socket path: unable to get conf option admin_socket for mon.cephaio-1: warning: line 13: 'osd_memory_target' in section 'osd' redefined

This ceph daemon command tries to resolve the socket path by running a ceph-conf command [1] and, as you mentioned, this command raises an exception. It's weird that the command executes properly with rc 0 but the warning message is present in the stdout, so maybe the python stderr/stdout handling is wrong somewhere.

# ceph-conf --name mon.cephaio-1 --show-config-value admin_socket
warning: line 13: 'osd_memory_target' in section 'osd' redefined 
/var/run/ceph/ceph-mon.cephaio-1.asok
# echo $?
0
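A quick way to confirm which stream the warning actually lands on (not from the original thread, just standard shell redirection) is to silence stdout and stderr one at a time:

# ceph-conf --name mon.cephaio-1 --show-config-value admin_socket 2>/dev/null
# ceph-conf --name mon.cephaio-1 --show-config-value admin_socket 1>/dev/null

If the warning survives the first command, it really is written to stdout and ends up mixed into the socket path that ceph.in reads back.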

The ceph daemon command could be changed to use the --admin-daemon option, which avoids resolving the socket path from the config [2] (and works even with the duplicate entries in the config):

# ceph --admin-daemon=/var/run/ceph/ceph-mon.cephaio-1.asok config get fsid
{
    "fsid": "5deaf2cd-a9c3-436b-9611-4a6a86719aa0"
}
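If the raw value is needed for scripting (as the ceph-facts task does), the fsid can be extracted from that JSON, for example with jq if it is installed; the socket path and mon name below are the same ones from the transcript above:

# ceph --admin-daemon=/var/run/ceph/ceph-mon.cephaio-1.asok config get fsid | jq -r '.fsid'
5deaf2cd-a9c3-436b-9611-4a6a86719aa0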

[1] https://github.com/ceph/ceph/blob/nautilus/src/ceph.in#L749-L750
[2] https://github.com/ceph/ceph/blob/nautilus/src/ceph.in#L736-L759

0 reactions
KingJ commented, Oct 2, 2019

Thanks for the fast turnaround on the fix!
