nfs-ganesha crashed with undefined symbol: ceph_start_reclaim

Bug Report

What happened: I don’t know if this is the right place to report this. If not, could you please direct me to the right place?

The playbook fails at TASK [ceph-nfs : start nfs gateway service] because nfs-ganesha does not start. However, if I install nfs-ganesha from https://launchpad.net/~nfs-ganesha, it works.

I thought I could set nfs_ganesha_stable_deb_repo: [trusted=yes] http://ppa.launchpad.net/nfs-ganesha/nfs-ganesha-2.7/ubuntu, but that does not work, because the packages from https://launchpad.net/~nfs-ganesha/+archive/ubuntu/libntirpc-1.7 are then missing. As far as I can tell, the current set of variables does not allow adding both of these repos for nfs-ganesha, or am I wrong? Maybe we could use a list here (https://github.com/ceph/ceph-ansible/blob/stable-3.2/roles/ceph-nfs/tasks/pre_requisite_non_container.yml#L42) instead of a single variable?
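To make the list idea concrete, here is a rough sketch of what such a task might look like. It is not taken from ceph-ansible: the variable name nfs_ganesha_stable_deb_repos is hypothetical, the libntirpc URL is an assumption derived from the Launchpad archive page by analogy with the nfs-ganesha-2.7 URL, and the when conditions are simply copied from the existing stable-repo tasks.

```yaml
# group_vars/all (sketch)
# Hypothetical list variable; it does not exist in ceph-ansible stable-3.2.
# Each entry must be quoted: an unquoted value starting with "[" would be
# read by YAML as the start of a flow sequence.
nfs_ganesha_stable_deb_repos:
  # Assumed URL, inferred from the libntirpc-1.7 Launchpad archive page.
  - "[trusted=yes] http://ppa.launchpad.net/nfs-ganesha/libntirpc-1.7/ubuntu"
  - "[trusted=yes] http://ppa.launchpad.net/nfs-ganesha/nfs-ganesha-2.7/ubuntu"
---
# roles/ceph-nfs/tasks/pre_requisite_non_container.yml (sketch)
# One task adds every repository in the list instead of a single one.
- name: add nfs-ganesha stable repositories
  apt_repository:
    repo: "deb {{ item }} {{ ceph_stable_distro_source | default(ansible_lsb.codename) }} main"
    state: present
  loop: "{{ nfs_ganesha_stable_deb_repos }}"
  when:
    - ansible_os_family == 'Debian'
    - nfs_ganesha_stable
    - ceph_origin == 'repository'
    - ceph_repository == 'community'
```

The loop keyword needs Ansible 2.5 or later. With a list, supporting a further dependency repo would only require another entry rather than another variable and task.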

A quick and dirty hack (adding an nfs_ganesha_libntirpc_stable_deb_repo variable) that just works could look like this:

diff --git a/roles/ceph-nfs/tasks/pre_requisite_non_container.yml b/roles/ceph-nfs/tasks/pre_requisite_non_container.yml
index aa0122aa..7293b792 100644
--- a/roles/ceph-nfs/tasks/pre_requisite_non_container.yml
+++ b/roles/ceph-nfs/tasks/pre_requisite_non_container.yml
@@ -37,6 +37,33 @@
     - ceph_origin == 'repository'
     - ceph_repository == 'dev'
 
+- name: add apt key for nfs-ganesha stable repository
+  apt_key:
+    keyserver: keyserver.ubuntu.com
+    id: EA914D611053D07BD332E18010353E8834DC57CA
+    state: present
+  register: add_ganesha_apt_key
+  retries: 5
+  delay: 10
+  until: add_ganesha_apt_key.failed == False
+  when:
+    - ansible_os_family == 'Debian'
+    - nfs_ganesha_stable
+    - ceph_origin == 'repository'
+    - ceph_repository == 'community'
+
+- name: add libntirpc stable repository
+  apt_repository:
+    repo: "deb {{ nfs_ganesha_libntirpc_stable_deb_repo }} {{ ceph_stable_distro_source | default(ansible_lsb.codename) }} main"
+    state: present
+    update_cache: no
+  register: add_ganesha_apt_repo
+  when:
+    - ansible_os_family == 'Debian'
+    - nfs_ganesha_stable
+    - ceph_origin == 'repository'
+    - ceph_repository == 'community'
+
 - name: add nfs-ganesha stable repository
   apt_repository:
     repo: "deb {{ nfs_ganesha_stable_deb_repo }} {{ ceph_stable_distro_source | default(ansible_lsb.codename) }} main"

Of course, I think it would be better to have the actual nfs-ganesha problem fixed. In any case, here are the error messages:

TASK [ceph-nfs : start nfs gateway service] ************************************
  master: fatal: [192.168.90.2]: FAILED! => {"changed": false, "msg": "Unable to start service nfs-ganesha: Job for nfs-ganesha.service failed because the control process exited with error code.\nSee \"systemctl status nfs-ganesha.service\" and \"journalctl -xe\" for details.\n"}
$ cat /var/log/ganesha/ganesha.log
11/01/2019 15:04:00 : epoch 5c38a250 : master : ganesha.nfsd-27627[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 2.8-dev.1
11/01/2019 15:04:00 : epoch 5c38a250 : master : ganesha.nfsd-27632[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
11/01/2019 15:04:00 : epoch 5c38a250 : master : ganesha.nfsd-27632[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
11/01/2019 15:04:00 : epoch 5c38a250 : master : ganesha.nfsd-27632[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
11/01/2019 15:04:00 : epoch 5c38a250 : master : ganesha.nfsd-27632[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
11/01/2019 15:04:00 : epoch 5c38a250 : master : ganesha.nfsd-27632[main] load_fsal :NFS STARTUP :FATAL :Could not dlopen module: /usr/lib/x86_64-linux-gnu/ganesha/libfsalceph.so Error: /usr/lib/x86_64-linux-gnu/ganesha/libfsalceph.so: undefined symbol: ceph_start_reclaim. You might want to install the nfs-ganesha-CEPH package
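The FATAL line says that libfsalceph.so references a symbol, ceph_start_reclaim, which the dynamic loader cannot find in the libraries present on the host. One way to narrow this down is to check whether the installed Ceph client library exports that symbol. The tasks below are only a diagnostic sketch: they assume the symbol is expected to come from libcephfs, that the library lives at /usr/lib/x86_64-linux-gnu/libcephfs.so.2, and that nm (from binutils) is installed on the target. None of these assumptions is confirmed by the report.

```yaml
# Diagnostic sketch only; the library path below is an assumption.
- name: check whether libcephfs exports ceph_start_reclaim
  # nm -D lists dynamic symbols; grep -w matches the whole symbol name.
  shell: nm -D --defined-only /usr/lib/x86_64-linux-gnu/libcephfs.so.2 | grep -w ceph_start_reclaim
  register: reclaim_symbol
  changed_when: false   # read-only check, never reports "changed"
  failed_when: false    # a missing symbol should not abort the play

- name: report the result
  debug:
    msg: "ceph_start_reclaim {{ 'found' if reclaim_symbol.rc == 0 else 'not found' }} in libcephfs"
```

A "not found" result (which is also what the sketch reports if the path is wrong or nm is missing) would suggest that the nfs-ganesha build is newer than the installed Ceph libraries, which would fit the Ganesha 2.8-dev.1 version shown in the log.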

What you expected to happen: nfs-ganesha should start without an error.

How to reproduce it (minimal and precise):

  • git checkout stable-3.2
  • use the group_vars/all config provided below
  • cp site.yml.sample site.yml
  • install requirements: pip install -r requirements.txt
  • start vagrant: vagrant up

Share your group_vars files, inventory: group_vars/all

ceph_origin: repository
ceph_repository: dev
ceph_dev_branch: mimic
public_network: 192.168.90.0/24
cluster_network: 192.168.90.0/24
monitor_interface: eth1
lvm_volumes:
  - data: /dev/sdb
  - data: /dev/sdc
osd_scenario: lvm
nfs_ganesha_dev: true
nfs_ganesha_flavor: ceph_mimic
nfs_file_gw: true
nfs_obj_gw: false

Note that I only used the dev repo because there is no package for bionic in http://download.ceph.com/nfs-ganesha/deb-V2.7-stable/mimic/dists/

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 18.04 (Vagrant box generic/ubuntu1804)
  • Kernel (e.g. uname -a): Linux master 4.15.0-39-generic #42-Ubuntu SMP Tue Oct 23 15:48:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Docker version if applicable (e.g. docker version):
  • Ansible version (e.g. ansible-playbook --version): ansible 2.6.11
  • ceph-ansible version (e.g. git head or tag or stable branch): stable-3.2 (commit 4e94d11aa738a0f5de3e84dc70d534fbe9c34c2f)
  • Ceph version (e.g. ceph -v): ceph version 13.2.4-88-g30fcb44 (30fcb440bcf4f65ce0ca5ef7d9c4f3a6cc9342bb) mimic (stable)

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

ajarr commented, Jan 18, 2019 (1 reaction)

@Bruceforce, true. I just noticed the change in directory structure too. I will talk to the package maintainer and check with them on that, and see if the ceph_ansible code needs to be changed.

Bruceforce commented, Jan 21, 2019 (0 reactions)

Closing; the discussion continues in #3520.
