Segfault while installing modules: scikit-learn, hiredis, numpy, redis, scipy, tqdm>=4.29.1

Environment

  • pip version: pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.5)
  • Python version: Python 3.5.3
  • OS: Debian GNU/Linux 9 (stretch)

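Description

pip3 segfaults while installing the modules listed in the title. In the output below, the first install exits cleanly and the segfault appears at the end of the second run.

Expected behavior

No segfault.

How to Reproduce

I wrote a reproducer here: https://github.com/pippy360/python_reproducer_for_segfault

To reproduce, install the following setup file with pip3 install . and then run the same command again: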

#!/usr/bin/env python
from setuptools import setup, find_packages

setup(
    name='lets cause a segfault',
    packages=find_packages(),
    python_requires='>=3.5',
    install_requires=[
        'scikit-learn',
        'hiredis',
        'numpy',
        'redis',
        'scipy',
        'tqdm>=4.29.1',
    ]
)

Output

user@instance-3:~/python_reproducer_for_segfault$ pip3 install .
Processing /home/user/python_reproducer_for_segfault
Collecting hiredis (from lets-cause-a-segfault==0.0.0)
  Downloading https://files.pythonhosted.org/packages/51/df/d2a08fb767247c0acea66255908e60cdeb4cd13cf71d42a1e2ca5803a1f8/hiredis-1.0.0-cp35-cp35m-manylinux1_x86_64.whl (49kB)
    100% |████████████████████████████████| 51kB 1.7MB/s 
Collecting numpy (from lets-cause-a-segfault==0.0.0)
  Downloading https://files.pythonhosted.org/packages/ad/15/690c13ae714e156491392cdbdbf41b485d23c285aa698239a67f7cfc9e0a/numpy-1.16.1-cp35-cp35m-manylinux1_x86_64.whl (17.2MB)
    100% |████████████████████████████████| 17.2MB 82kB/s 
Collecting redis (from lets-cause-a-segfault==0.0.0)
  Downloading https://files.pythonhosted.org/packages/f1/19/a0282b77c23f9f9dbcc6480787a60807c78a45947593a02dbf026636c90d/redis-3.1.0-py2.py3-none-any.whl (63kB)
    100% |████████████████████████████████| 71kB 11.3MB/s 
Collecting scikit-learn (from lets-cause-a-segfault==0.0.0)
  Downloading https://files.pythonhosted.org/packages/18/d9/bea927c86bf78d583d517f24cbc87606cb333bfb3a5c99cb85b547305f0f/scikit_learn-0.20.2-cp35-cp35m-manylinux1_x86_64.whl (5.3MB)
    100% |████████████████████████████████| 5.3MB 279kB/s 
Collecting scipy (from lets-cause-a-segfault==0.0.0)
  Downloading https://files.pythonhosted.org/packages/ab/19/c0ad5b9183ef97030edd6297d1726525ff2c369a09fbb6ea52a1e616ffd6/scipy-1.2.0-cp35-cp35m-manylinux1_x86_64.whl (26.5MB)
    100% |████████████████████████████████| 26.5MB 48kB/s 
Collecting tqdm>=4.29.1 (from lets-cause-a-segfault==0.0.0)
  Downloading https://files.pythonhosted.org/packages/76/4c/103a4d3415dafc1ddfe6a6624333971756e2d3dd8c6dc0f520152855f040/tqdm-4.30.0-py2.py3-none-any.whl (47kB)
    100% |████████████████████████████████| 51kB 10.9MB/s 
Installing collected packages: hiredis, numpy, redis, scipy, scikit-learn, tqdm, lets-cause-a-segfault
  Running setup.py install for lets-cause-a-segfault ... done
Successfully installed hiredis-1.0.0 lets-cause-a-segfault-0.0.0 numpy-1.16.1 redis-3.1.0 scikit-learn-0.20.2 scipy-1.2.0 tqdm-4.30.0
user@instance-3:~/python_reproducer_for_segfault$ ^C
user@instance-3:~/python_reproducer_for_segfault$ pip3 install .
Processing /home/user/python_reproducer_for_segfault
Collecting hiredis (from lets-cause-a-segfault==0.0.0)
  Using cached https://files.pythonhosted.org/packages/51/df/d2a08fb767247c0acea66255908e60cdeb4cd13cf71d42a1e2ca5803a1f8/hiredis-1.0.0-cp35-cp35m-manylinux1_x86_64.whl
Collecting numpy (from lets-cause-a-segfault==0.0.0)
  Using cached https://files.pythonhosted.org/packages/ad/15/690c13ae714e156491392cdbdbf41b485d23c285aa698239a67f7cfc9e0a/numpy-1.16.1-cp35-cp35m-manylinux1_x86_64.whl
Collecting redis (from lets-cause-a-segfault==0.0.0)
  Using cached https://files.pythonhosted.org/packages/f1/19/a0282b77c23f9f9dbcc6480787a60807c78a45947593a02dbf026636c90d/redis-3.1.0-py2.py3-none-any.whl
Collecting scikit-learn (from lets-cause-a-segfault==0.0.0)
  Using cached https://files.pythonhosted.org/packages/18/d9/bea927c86bf78d583d517f24cbc87606cb333bfb3a5c99cb85b547305f0f/scikit_learn-0.20.2-cp35-cp35m-manylinux1_x86_64.whl
Collecting scipy (from lets-cause-a-segfault==0.0.0)
  Using cached https://files.pythonhosted.org/packages/ab/19/c0ad5b9183ef97030edd6297d1726525ff2c369a09fbb6ea52a1e616ffd6/scipy-1.2.0-cp35-cp35m-manylinux1_x86_64.whl
Collecting tqdm>=4.29.1 (from lets-cause-a-segfault==0.0.0)
  Using cached https://files.pythonhosted.org/packages/76/4c/103a4d3415dafc1ddfe6a6624333971756e2d3dd8c6dc0f520152855f040/tqdm-4.30.0-py2.py3-none-any.whl
Installing collected packages: hiredis, numpy, redis, scipy, scikit-learn, tqdm, lets-cause-a-segfault
  Running setup.py install for lets-cause-a-segfault ... done
Successfully installed hiredis-1.0.0 lets-cause-a-segfault-0.0.0 numpy-1.16.1 redis-3.1.0 scikit-learn-0.20.2 scipy-1.2.0 tqdm-4.30.0
Segmentation fault
user@instance-3:~/python_reproducer_for_segfault$ 
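The transcript above shows pip reporting success and the shell printing "Segmentation fault" only afterwards. A minimal sketch, not part of the original report, of how a script could tell a normal exit from death by signal; the helper name describe_exit is hypothetical, and the negative return code convention applies to POSIX systems only:

```python
import signal
import subprocess
import sys


def describe_exit(cmd):
    """Run cmd and say whether it exited normally or was killed by a signal."""
    returncode = subprocess.run(cmd).returncode
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status {}".format(returncode)


# Harmless demonstration: a child interpreter that does nothing and exits.
print(describe_exit([sys.executable, "-c", "pass"]))
```

For the reproducer in this issue, the corresponding call would be describe_exit(["pip3", "install", "."]) from the project directory. On the second run it would be expected to report SIGSEGV, although that was not verified here.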

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 13 (8 by maintainers)

Top GitHub Comments

chrahunt commented, Oct 9, 2019

I was able to reproduce this.

The segfault occurs at Python interpreter shutdown, when trying to clean up the Reader member of the hiredis module - the type associated with this object is null, so attempting to dereference it causes a segfault.

backtrace
(gdb) p $_siginfo
$1 = {si_signo = 11, si_errno = 0, si_code = 1, _sifields = {_pad = {169, 0, -364572592,
      21963, -1818589628, 32764, -1528607782, 32732, 1, 0, 160, 0, -431765888, 21963,
      -367054240, 21963, 44, 0, -436689679, 21963, -367005680, 21963, -367054240, 21963,
      -367005680, 21963, -438187525, 21963}, _kill = {si_pid = 169, si_uid = 0}, _timer = {
      si_tid = 169, si_overrun = 0, si_sigval = {sival_int = -364572592,
        sival_ptr = 0x55cbea451050}}, _rt = {si_pid = 169, si_uid = 0, si_sigval = {
        sival_int = -364572592, sival_ptr = 0x55cbea451050}}, _sigchld = {si_pid = 169,
      si_uid = 0, si_status = -364572592, si_utime = -7810782977104783925,
      si_stime = -6565320432101064708}, _sigfault = {si_addr = 0xa9,
      _addr_lsb = 94334297116752, _addr_bnd = {_lower = 0x7ffc939a8644,
        _upper = 0x7fdca4e34bda}}, _sigpoll = {si_band = 169, si_fd = -364572592}}}
(gdb) bt
#0  0x00005555556aa054 in visit_decref () at ../Modules/gcmodule.c:373
#1  0x00005555557443c5 in dict_traverse.lto_priv () at ../Objects/dictobject.c:2570
#2  0x00005555556ae773 in subtract_refs () at ../Modules/gcmodule.c:398
#3  collect () at ../Modules/gcmodule.c:951
#4  0x00005555557a591d in collect_with_callback () at ../Modules/gcmodule.c:1119
#5  0x00005555557a5981 in PyGC_Collect () at ../Modules/gcmodule.c:1583
#6  0x00005555557ab1ac in Py_Finalize () at ../Python/pylifecycle.c:567
#7  0x00005555557ab2a8 in Py_Exit (sts=sts@entry=0) at ../Python/pylifecycle.c:1465
#8  0x00005555557ab38e in handle_system_exit () at ../Python/pythonrun.c:602
#9  0x00005555557ab3f6 in PyErr_PrintEx () at ../Python/pythonrun.c:612
#10 0x00005555557d8c7a in RunModule () at ../Modules/main.c:210
#11 0x00005555557d952f in Py_Main () at ../Modules/main.c:709
#12 0x0000555555668d71 in main () at ../Programs/python.c:65
#13 0x00007ffff6cee2e1 in __libc_start_main (main=0x555555668c90 <main>, argc=5,
    argv=0x7fffffffe648, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7fffffffe638) at ../csu/libc-start.c:291
#14 0x000055555576fa7a in _start ()

If we look at op in dict_traverse, we see that it is the module dict for hiredis:

(gdb) up
#1  0x00005555557443c5 in dict_traverse.lto_priv () at ../Objects/dictobject.c:2570
2570    ../Objects/dictobject.c: No such file or directory.
(gdb) p op
$2 = {'ProtocolError': <type at remote 0x555555d40778>, 'ReplyError': <type at remote 0x555555d40c18>, '__spec__': <ModuleSpec(submodule_search_locations=['/home/chris/.local/lib/python3.5/site-packages/hiredis'], name='hiredis', origin='/home/chris/.local/lib/python3.5/site-packages/hiredis/__init__.py', _set_fileattr=True, _cached='/home/chris/.local/lib/python3.5/site-packages/hiredis/__pycache__/__init__.cpython-35.pyc', loader=<SourceFileLoader(path='/home/chris/.local/lib/python3.5/site-packages/hiredis/__init__.py', name='hiredis') at remote 0x7ffff6aa17b8>, _initializing=False, loader_state=None) at remote 0x7ffff6aa1860>, 'version': <module at remote 0x7ffff6a9de08>, 'hiredis': <module at remote 0x7ffff6a9dc78>, '__doc__': None, '__cached__': '/home/chris/.local/lib/python3.5/site-packages/hiredis/__pycache__/__init__.cpython-35.pyc', '__version__': '1.0.0', '__loader__': <...>, 'HiredisError': <type at remote 0x555555d18448>, '__package__': 'hiredis', '__file__': '/home/chris/.local/lib/python3.5/site-pa...(truncated)

Inspecting the Reader entry shows that its ob_type is null. visit_decref dereferences ob_type as part of PyObject_IS_GC, which leads to the segfault:

(gdb) p mp->ma_keys->dk_entries[31]
$34 = {me_hash = -3403915311998756673, me_key = 'Reader',
  me_value = <unknown at remote 0x7f2a422a4940>}
(gdb) p mp->ma_keys->dk_entries[31].me_value.ob_refcnt
$36 = 1
(gdb) p mp->ma_keys->dk_entries[31].me_value.ob_type
$37 = (struct _typeobject *) 0x0

In pip 9.0.1 (the version installed by python3-pip in Debian 9) we were using CacheControl 0.11.7.

In CacheControl 0.11.7, cachecontrol.caches was importing redis.

redis imports hiredis.
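The chain above (CacheControl imports redis, redis imports hiredis) is a case of one import pulling in another as a side effect. A minimal sketch, assuming nothing about pip's internals, of how such a side effect can be checked in a fresh interpreter; the helper name pulled_in_by is hypothetical, and the demonstration uses only standard-library modules:

```python
import subprocess
import sys


def pulled_in_by(trigger, targets):
    """Return the targets found in sys.modules after importing trigger
    in a fresh interpreter."""
    code = (
        "import importlib, sys\n"
        "importlib.import_module({!r})\n"
        "print('\\n'.join(t for t in {!r} if t in sys.modules))\n"
    ).format(trigger, list(targets))
    # A separate process keeps this interpreter's own imports out of the result.
    out = subprocess.check_output([sys.executable, "-c", code])
    return out.decode().split()


# Standard-library demonstration: importing json also loads json.decoder.
print(pulled_in_by("json", ["json.decoder", "no_such_module_xyz"]))
```

In the affected environment the same check could be pointed at redis and hiredis, with the relevant pip or CacheControl module as the trigger. The thread does not give that module path, so it is left out here.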

For some reason, importing hiredis before pip does the work of installing hiredis creates the condition that leads to the segfault described above. This matches the observation that the crash happens on the second and subsequent installs: on the first install, hiredis is not yet present to be imported.

I’ve simplified this a bit by doing the following:

  1. Installing hiredis alone - still segfaults
  2. Installing hiredis from a pre-downloaded wheel - still segfaults
  3. Removing the import from the CacheControl wheel and moving it into pip directly - still segfaults
  4. Importing hiredis instead of redis - still segfaults
  5. Moving hiredis.cpython-35m-x86_64-linux-gnu.so to hiredis.so, so that the file that gets written on install is different than the one that was imported - does not segfault
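Step 5 works because the import system accepts more than one file name for the same extension module, so the renamed hiredis.so is still importable. A small sketch of how to list the accepted names on a given interpreter; the helper name extension_filenames is hypothetical, and the actual suffix list on the Debian 9 machine in the report was not shown in the thread:

```python
import importlib.machinery


def extension_filenames(module_name):
    """File names the import system would accept for a C extension module."""
    return [
        module_name + suffix
        for suffix in importlib.machinery.EXTENSION_SUFFIXES
    ]


# Prints the candidate names for the running interpreter and platform.
print(extension_filenames("hiredis"))
```

On a CPython 3.5 Linux build the list would be expected to contain both the tagged name from step 5 and the plain .so name, but this should be confirmed on the machine in question.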

I haven’t been able to reproduce this with the latest pip yet, even explicitly importing hiredis in pip itself and running with --user --ignore-installed in order to get similar behavior to 9.0.1.

pippy360 commented, Oct 8, 2019

I have responded to your email with the SSH details for a VM.
