Conditional `with gil` and threading
A nogil block with a loop that uses an unlikely `with gil`
statement, e.g. approximately,
    cpdef float test_gil(float[:] x):
        cdef int i = 0
        cdef int size = x.shape[0]
        cdef float out = 0
        with nogil:
            while i < size:
                out += x[i]
                if i > size + 1:  # never true: the branch is deliberately unlikely
                    with gil:
                        raise ValueError
                i += 1
        return out
appears to behave like a function that never releases the GIL when it is used in multi-threaded code.
See more complete minimal example in https://github.com/scikit-learn/scikit-learn/pull/17038#issuecomment-619476846
Naively I thought that since the `with gil` block is never executed, performance-wise this would be equivalent to the same function without the `with gil` block (or roughly that threading performance would be directly impacted by the fraction of run time where the GIL is released). However, that does not seem to be the case.
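One rough way to check that expectation is to time the same calls run serially and then spread over threads. The sketch below is only a generic harness, not the benchmark from the linked comment: `compare_serial_vs_threaded` is a hypothetical helper name, and `time.sleep` (which releases the GIL while waiting) stands in for a compiled function that behaves well under `nogil`.

```python
import threading
import time


def compare_serial_vs_threaded(func, args=(), n_threads=4):
    """Time n_threads calls of func run serially, then one call per thread.

    Returns (serial_seconds, threaded_seconds).
    """
    # Serial baseline: the calls run one after another.
    start = time.perf_counter()
    for _ in range(n_threads):
        func(*args)
    serial = time.perf_counter() - start

    # Threaded run: one call per thread, all started before any is joined.
    threads = [threading.Thread(target=func, args=args) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    threaded = time.perf_counter() - start
    return serial, threaded


if __name__ == "__main__":
    # Stand-in workload (assumption): replace time.sleep with the compiled
    # function under test, e.g. test_gil with a large float32 array.
    serial, threaded = compare_serial_vs_threaded(time.sleep, (0.05,))
    print(f"serial: {serial:.3f}s  threaded: {threaded:.3f}s")
```

If the threaded time stays close to the serial time for a function that is supposed to run without the GIL, the threads are probably contending for it, which is what the example above appears to show.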
Is this behavior expected? If so maybe the documentation section https://cython.readthedocs.io/en/latest/src/userguide/external_C_code.html#acquiring-and-releasing-the-gil could be updated to mention conditional acquiring of GIL.
Issue Analytics
- Created 3 years ago
- Comments:6 (4 by maintainers)
Top GitHub Comments
I think the best solution is to get rid of the try-finally node and do the cleanup in `FuncDefNode.generate_function_definitions()`, acquiring the GIL only when necessary for a given exit case (error or success).

Note that you do not need to write the `with gil` in this specific example since it’s implicit in the `raise` statement (i.e. you can raise exceptions also from within nogil sections). That generates the same code as `with gil`, though, so will run into the same problem.

Looking at the C code that Cython generates from the example that you linked to, the difference is that the function that uses `with gil` has a Python temp variable, which needs to be cleaned up at the end in the exception case. For that, it needs to acquire the GIL. However, it wouldn’t need the GIL in the success case, and still acquires it. That makes it uselessly hammer on the GIL in very fast functions like this.

The GIL acquisition is unconditionally inserted here, as a try-finally statement wrapping the whole function body: https://github.com/cython/cython/blob/f09e61ab721ad51526ec7a6798fc01d8346f539d/Cython/Compiler/ParseTreeTransforms.py#L1861-L1872

and gets released here, after generating the function code: https://github.com/cython/cython/blob/f09e61ab721ad51526ec7a6798fc01d8346f539d/Cython/Compiler/Nodes.py#L2145-L2148

I don’t know how difficult it would be to detect the needless acquire and avoid it, but in any case, I consider it a bug that this happens unconditionally. In the success case, it should run through the function without touching the GIL at all.