Seemingly random segfault on macOS if function is in larger library
Reporting a bug
- [x] I am using the latest released version of Numba (most recent is visible in the change log: https://github.com/numba/numba/blob/master/CHANGE_LOG).
- [x] I have included below a minimal working reproducer (if you are unsure how to write one see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).
Hi,
first of all, sorry for this small report and the few examples, but at this point the issue seems so untraceable to me that I'm hoping for any input to track it down. Maybe it's even a severely stupid mistake of my own that I just can't find. I was getting segfaults from a Numba function and traced it down to the state I will outline here, but at this point I can't find anything anymore.
I have a function which operates on arrays. I have simplified it very far, so I know it doesn't make much sense anymore, but here it goes:
import numpy as np
import numba
@numba.jit("f8[:,:](f8[:,:,:],f8[:,:],f8,f8,f8[:])", nopython=True, parallel=True,
           nogil=True)
def evalManyIndividualQ3D(individuals, X, p1, p2, p3):
    fitnesses = np.zeros((individuals.shape[0], 4))
    nF = individuals.shape[0]
    for i in numba.prange(nF):
        individual = individuals[i]    # left over from the real function, unused here
        P = np.random.random((3, 4))   # likewise unused in this stripped-down version
        fitnesses[i] = np.random.random((4,))
    return fitnesses
For debugging, I'm using synthetic input data:
n = 15
m = 3
inputInd = np.random.random((500, n, m))
inputArray = np.random.random((n, m))
p1 = 25e-3
p2 = 55.
p3 = np.array([320., 240.])
ret = evalManyIndividualQ3D(inputInd, inputArray, p1, p2, p3)
Running this as a small script works. Running it interactively works. However, I have a large library with Numba functions in which the one above is included, just somewhere in there, same syntax, copy & paste. If I then compile the full library (including the above function) and afterwards add the execution with the same synthetic input data, only calling the function as above:
ret = evalManyIndividualQ3D(inputInd, inputArray, p1, p2, p3)
I'm getting a segfault. No traceback, nothing. In the terminal it's zsh: segmentation fault; Jupyter just hangs completely.
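One way to get at least a Python-level traceback out of a hard crash like this is the standard library's faulthandler module (a minimal sketch; it prints the Python stack when a fatal signal such as SIGSEGV arrives, though not the native frames where the fault actually occurs):

import faulthandler
faulthandler.enable()  # dump the Python traceback on SIGSEGV, SIGABRT, etc.

ret = evalManyIndividualQ3D(inputInd, inputArray, p1, p2, p3)

Alternatively, running python -X faulthandler script.py enables the same thing without code changes.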
This happens on macOS 10.15. A difference worth mentioning is that during compilation of the library, I'm getting some warnings:
NumbaPerformanceWarning: '@' is faster on contiguous arrays, called on (array(float64, 2d, A), array(float64, 2d, A))
warnings.warn(NumbaPerformanceWarning(msg))
Those products are not in the crashing function, nor connected to it!
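For reference, that warning fires when '@' is typed with non-contiguous ('A' layout) operands; a hedged sketch of how it typically arises and how to silence it (the function and shapes here are illustrative, not from the real library):

import numpy as np
from numba import njit

@njit
def prod(a, b):
    # copying the strided views into C-contiguous buffers avoids the
    # NumbaPerformanceWarning and the slower non-contiguous matmul path
    return np.ascontiguousarray(a) @ np.ascontiguousarray(b)

M = np.random.random((6, 6))
prod(M[::2, :], M[:, ::2])  # strided views type as 'A' layout arrays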
At this point, I'm happy for any kind of input, since I can't find a reason.
I think the problem here is as follows… using this code as an example:
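(The example snippet didn't survive here; the following is a reconstruction of a minimal script of the shape described in the steps below — the name f, its body, and the counts 2/4 are purely illustrative.)

import numba
import numpy as np

@numba.njit(parallel=True, debug=True)
def f(n):
    acc = np.zeros(n)
    for i in numba.prange(n):
        acc[i] = i
    return acc.sum()

if __name__ == "__main__":
    # mutate the config *after* numba's import-time thread detection
    numba.config.NUMBA_NUM_THREADS = 2
    print(f(100))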
When this script is run the following sequence occurs:

1. `import numba`, via its `__init__`, https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/__init__.py#L38-L39 imports `vectorize`, which goes via `numba.np.ufunc.__init__` https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/__init__.py#L3 which has the side effect of this import too: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/__init__.py#L6-L7
2. With `numba.np.ufunc.parallel` being imported as part of `numba.__init__`, this module global is evaluated: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/parallel.py#L47 the result being that `NUM_THREADS` is e.g. 4 for a 4 core machine.
3. The script reaches `if __name__ == "__main__"`, this making a call first to set `numba.config.NUMBA_NUM_THREADS=2` and then to call the `@numba.njit(parallel=True, debug=True)` decorated function `f`.
4. On calling `f`, `_launch_threads` is called to start the actual thread pool; this is done here https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/parfors/parfor_lowering.py#L1419 and is set with the value `NUM_THREADS` from above here: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/parallel.py#L493 as a result there's a threadpool of size e.g. 4. Then `_load_num_threads_funcs()` is called here: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/parallel.py#L495 which then calls the backend specific `_set_num_threads` function such that the main thread has `NUM_THREADS` as the number of threads in the pool in its TLS slot, here: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/parallel.py#L511 and here (for OpenMP): https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/omppool.cpp#L59-L63
5. When compiling `f`, the parfors lowering queries the Python function `numba.np.ufunc.parallel.get_thread_count` from here https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/parfors/parfor_lowering.py#L1503 and this function in turn looks like: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/parallel.py#L37-L44 which results in the `sched_size` ending up based on the value 2, as it's read from the `numba.config` variable.
6. However, later, when the memory allocated for the `sched_size` is used at run time in a call to `do_scheduling`: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/parfors/parfor_lowering.py#L1530-L1535 the number of threads used also comes from a call made at runtime from here: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/parfors/parfor_lowering.py#L1521 the value of which at runtime is e.g. 4, as it's from the TLS slot in the threading backend, e.g. for OpenMP: https://github.com/numba/numba/blob/b4badb5f0ecae44ce3fbc57d83a85d24488699a3/numba/np/ufunc/omppool.cpp#L65-L76

So the schedule buffer is sized for 2 threads but used by e.g. 4 threads at run time, which is exactly the kind of mismatch that corrupts memory and segfaults.
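A quick way to observe the two disagreeing values from steps 5 and 6 (a diagnostic sketch based on the analysis above; numba.get_num_threads exists in recent Numba releases and reads the backend's TLS value, so the two prints can differ):

import numba

numba.config.NUMBA_NUM_THREADS = 2  # late override, after numba's import-time detection

# the value parfors lowering would bake into sched_size via get_thread_count()
print("compile-time view:", numba.config.NUMBA_NUM_THREADS)  # -> 2

# the value the launched pool reports from its TLS slot; per the analysis it
# comes from the import-time NUM_THREADS, e.g. 4 on a 4-core machine
print("run-time view:", numba.get_num_threads())  # -> e.g. 4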
Enough of the keywords in this issue line up with things going on in my debugging that I thought I'd chime in (and watch for further updates). I have been using `@njit(parallel=True, cache=True)` on a function and the test case intermittently fails by hanging the pytest process so I can't interrupt it and have to kill it. Triggering recompilation of the `njit`ed function returns it to working.

I've been unable to track down the root cause, but @stuartarchibald's analysis above was comprehensive (seriously impressive!) and it seems likely the "wrong" value is getting baked in somewhere in my case. As a workaround I'm not using `cache=True` on those functions and I'm no longer calling `numba.set_num_threads` at all.

Is #6025 the best hope for resolving this on macOS?
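For anyone who wants that workaround in code form, a sketch (the function is illustrative, not the actual test case):

from numba import njit, prange

# dropping cache=True forces a fresh compile in each process, so the parfor
# schedule is sized against the live thread count rather than a value baked
# into a cached binary; was: @njit(parallel=True, cache=True)
@njit(parallel=True)
def total(x):
    s = 0.0
    for i in prange(x.shape[0]):
        s += x[i]
    return s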