Segfault when executing a cached lexsort implementation

Reporting a bug

$ pip freeze
llvmlite==0.32.1
numba==0.49.1
numpy==1.18.4

numba.literal_unroll seems like it would make a lexsort implementation easy. I had a go and it mostly worked! So thanks for numba.literal_unroll!

However, when I pass cache=True to the jit decorator on lexsort, the function succeeds on the initial run but segfaults on subsequent runs:

from numba import njit, generated_jit, literal_unroll, gdb
import numba.core.types as types
import numba
import numpy as np

@njit
def cmp_fn(l, r, *arrays):
    for a in literal_unroll(arrays):
        if a[l] < a[r]:
            return -1  # less than
        elif a[l] > a[r]:
            return 1   # greater than

    return 0  # equal


@njit
def quicksort(index, L, R, *arrays):
    l, r = L, R
    pivot = index[(l + r) // 2]

    while True:
        while l < R and cmp_fn(index[l], pivot, *arrays) == -1:
            l += 1
        while r >= L and cmp_fn(pivot, index[r], *arrays) == -1:
            r -= 1

        if l >= r:
            break

        index[l], index[r] = index[r], index[l]
        l += 1
        r -= 1

        if L < r:
            quicksort(index, L, r, *arrays)

        if l < R:
            quicksort(index, l, R, *arrays)

@njit(nogil=True, cache=True, debug=True)
def lexsort(arrays):
    print("starting lexsort")

    if len(arrays) == 0:
        return np.empty((), dtype=np.intp)

    if len(arrays) == 1:
        return np.argsort(arrays[0])


    for a in literal_unroll(arrays[1:]):
        if a.shape != arrays[0].shape:
            raise ValueError("lexsort array shapes don't match")

    n = arrays[0].shape[0]
    index = np.arange(n)

    quicksort(index, 0, n - 1, *arrays)

    print("ending lexsort")
    return index

a = np.array([2, 1, 4, 3, 0], dtype=np.int32)
b = np.array([1, 2, 3, 4, 5], dtype=np.float64)

print("before lexsort")
print(lexsort((a, b)))
print("after lexsort")

I considered out-of-bounds access in my quicksort implementation, but that code is never reached. I ran this through gdb: on the second run, “starting lexsort” isn’t printed before the crash.

$ gdb -ex r --args python test_literal_unroll.py 
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/5f/4de7b7974f514b4d5baf54bc956904a450c144.debug...done.
done.
Starting program: /home/sperkins/venv/numba/bin/python test_literal_unroll.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffee395700 (LWP 20316)]
[New Thread 0x7fffebb94700 (LWP 20317)]
[New Thread 0x7fffe9393700 (LWP 20318)]
[New Thread 0x7fffe8b92700 (LWP 20319)]
[New Thread 0x7fffe4391700 (LWP 20320)]
[New Thread 0x7fffe1b90700 (LWP 20321)]
[New Thread 0x7fffe138f700 (LWP 20322)]
[Thread 0x7fffe138f700 (LWP 20322) exited]
[Thread 0x7fffe1b90700 (LWP 20321) exited]
[Thread 0x7fffe4391700 (LWP 20320) exited]
[Thread 0x7fffe8b92700 (LWP 20319) exited]
[Thread 0x7fffe9393700 (LWP 20318) exited]
[Thread 0x7fffebb94700 (LWP 20317) exited]
[Thread 0x7fffee395700 (LWP 20316) exited]
before lexsort
starting lexsort
ending lexsort
[4 1 0 3 2]
after lexsort
[Inferior 1 (process 20309) exited normally]
(gdb) r
Starting program: /home/sperkins/venv/numba/bin/python test_literal_unroll.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffee395700 (LWP 20327)]
[New Thread 0x7fffebb94700 (LWP 20328)]
[New Thread 0x7fffeb393700 (LWP 20329)]
[New Thread 0x7fffe8b92700 (LWP 20330)]
[New Thread 0x7fffe4391700 (LWP 20332)]
[New Thread 0x7fffe1b90700 (LWP 20333)]
[New Thread 0x7fffdf38f700 (LWP 20334)]
[Thread 0x7fffdf38f700 (LWP 20334) exited]
[Thread 0x7fffe1b90700 (LWP 20333) exited]
[Thread 0x7fffe4391700 (LWP 20332) exited]
[Thread 0x7fffe8b92700 (LWP 20330) exited]
[Thread 0x7fffeb393700 (LWP 20329) exited]
[Thread 0x7fffebb94700 (LWP 20328) exited]
[Thread 0x7fffee395700 (LWP 20327) exited]
before lexsort

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff7fe8715 in cpython::__main__::lexsort$241(Tuple<Array<int, 1, C, mutable, aligned>, Array<double, 1, C, mutable, aligned> >) ()
#2  0x00007fffeb98d5d9 in call_cfunc (self=self@entry=0x7fffda0f0660, cfunc=cfunc@entry=0x7fffd9d80ee8, args=args@entry=0x7ffff7e66f28, 
    kws=kws@entry=0x0, locals=locals@entry=0x0) at numba/_dispatcher.c:353
#3  0x00007fffeb98d910 in compile_and_invoke (self=self@entry=0x7fffda0f0660, args=args@entry=0x7ffff7e66f28, kws=kws@entry=0x0, locals=locals@entry=0x0)
    at numba/_dispatcher.c:376
#4  0x00007fffeb98dace in Dispatcher_call (self=0x7fffda0f0660, args=0x7ffff7e66f28, kws=<optimised out>) at numba/_dispatcher.c:591
#5  0x00000000005a9cbc in _PyObject_FastCallDict (kwargs=<optimised out>, nargs=1, args=0xae0410, func=0x7fffda0f0660) at ../Objects/tupleobject.c:131
#6  _PyObject_FastCallKeywords () at ../Objects/abstract.c:2496
#7  0x000000000050a5c3 in call_function.lto_priv () at ../Python/ceval.c:4875
#8  0x000000000050bfb4 in _PyEval_EvalFrameDefault () at ../Python/ceval.c:3335
#9  0x0000000000507d64 in PyEval_EvalFrameEx (throwflag=0, f=0xae0288) at ../Python/ceval.c:754
#10 _PyEval_EvalCodeWithName.lto_priv.1820 () at ../Python/ceval.c:4166
#11 0x000000000050ae13 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=0, args=0x0, 
    locals=<optimised out>, globals=<optimised out>, _co=<optimised out>) at ../Python/ceval.c:4187
#12 PyEval_EvalCode (co=<optimised out>, globals=<optimised out>, locals=<optimised out>) at ../Python/ceval.c:731
#13 0x0000000000634c82 in run_mod () at ../Python/pythonrun.c:1025
#14 0x0000000000634d37 in PyRun_FileExFlags () at ../Python/pythonrun.c:978
#15 0x00000000006384ef in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:419
#16 0x00000000006386c5 in PyRun_AnyFileExFlags () at ../Python/pythonrun.c:81
#17 0x0000000000639091 in run_file (p_cf=0x7fffffffd70c, filename=<optimised out>, fp=<optimised out>) at ../Modules/main.c:340
---Type <return> to continue, or q <return> to quit---
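
Because the crash only appears once a cache entry exists, switching between the "first run" and "cached run" states means clearing the cache between attempts. The following is a minimal sketch, assuming Numba's default on-disk cache layout (index files ending in .nbi and data files ending in .nbc, stored in the __pycache__ directory next to the script, unless NUMBA_CACHE_DIR points elsewhere). clear_numba_cache is a hypothetical helper written for illustration, not part of Numba.

```python
from pathlib import Path


def clear_numba_cache(script_dir="."):
    """Delete Numba cache files (*.nbi, *.nbc) from script_dir/__pycache__.

    Hypothetical helper; assumes the default cache location.
    Returns the sorted names of the files that were removed.
    """
    cache_dir = Path(script_dir) / "__pycache__"
    removed = []
    if cache_dir.is_dir():
        for pattern in ("*.nbi", "*.nbc"):
            for path in cache_dir.glob(pattern):
                path.unlink()  # remove one cache index or data file
                removed.append(path.name)
    return sorted(removed)
```

Calling clear_numba_cache() from the directory containing test_literal_unroll.py before a run should force a fresh compilation, so the next run behaves like the initial one and the run after that loads from the cache again.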

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
sjperkins commented, May 9, 2020

I think this has the potential to be a numba lexsort implementation. I note that there are quicksort factories in numba.misc.quicksort.py but they’d need updating to support the lexical tuple comparison.

0 reactions
PercyLau commented, Oct 16, 2020

It seems to work fine with Python 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32, numba 0.51.2, numpy 1.19.1.

However, the above lexsort is inefficient, since the recursion can easily overflow the stack. Any ideas for implementing a tail-recursive version?
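
One common way to avoid deep recursion is to replace the recursive calls with an explicit stack and always continue with the smaller partition, which keeps the stack depth at O(log n). The sketch below is plain Python, not the code from the issue, and has not been checked under numba (the loop over keys would presumably need literal_unroll there, as in cmp_fn). The names lex_less and lexsort_iterative are hypothetical. As in the issue's cmp_fn, the first array is the primary key, which differs from np.lexsort, where the last key is primary.

```python
def lex_less(i, j, keys):
    """Return True if row i sorts strictly before row j, comparing keys in order."""
    for k in keys:
        if k[i] < k[j]:
            return True
        if k[i] > k[j]:
            return False
    return False  # all keys equal


def lexsort_iterative(keys):
    """Argsort rows lexicographically; keys[0] is the primary key."""
    n = len(keys[0]) if keys else 0
    index = list(range(n))
    stack = [(0, n - 1)]  # explicit stack of inclusive (lo, hi) ranges

    while stack:
        lo, hi = stack.pop()
        while lo < hi:
            pivot = index[(lo + hi) // 2]  # row id of the pivot, fixed for this pass
            l, r = lo, hi
            # Hoare-style partition around the pivot row
            while l <= r:
                while lex_less(index[l], pivot, keys):
                    l += 1
                while lex_less(pivot, index[r], keys):
                    r -= 1
                if l <= r:
                    index[l], index[r] = index[r], index[l]
                    l += 1
                    r -= 1
            # Rows in [lo, r] are <= pivot, rows in [l, hi] are >= pivot.
            # Defer the larger side and keep looping on the smaller one.
            if r - lo < hi - l:
                stack.append((l, hi))
                hi = r
            else:
                stack.append((lo, r))
                lo = l
    return index
```

Like the recursive version, this sort is not stable: rows whose keys are all equal may come out in either order.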
