eigh() tests fail to pass, crash Python with seemingly random pattern

This problem is related to #11601, which was closed by #11702 (@ilayn). However, the crash has not been fixed by the latter PR.

The symptoms remain almost identical to those described in my comment at https://github.com/scipy/scipy/issues/11601#issuecomment-600153321

In summary, when running the test for eigh(), Python tends to crash with SIGSEGV or SIGABRT. Sometimes this happens during the test_eigh() function, sometimes after it passed with “100%” but before pytest returns.

The test that triggers the crash is the following test function:

https://github.com/scipy/scipy/blob/ae34ce4835949a8310d7c3d7bcb4a55aafd11f4f/scipy/linalg/tests/test_decomp.py#L863-L888

Some patterns from the histories of crashes

I ran the test script with runtests.py 100 times and saved the output of each run as a text file.
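The repeated runs could be automated along the following lines. This is only a sketch: the helper names `describe_exit` and `run_many` and the `run_NNN.txt` file naming are made up for illustration, and it assumes a POSIX system where `subprocess` reports a child killed by signal N as return code -N. Whether the wrapper script passes the signal through as its own exit status is also an assumption.

```python
import signal
import subprocess
from collections import Counter

# Command from the "Reproducing code example" section of this report.
CMD = ["./runtests.py", "-vt",
       "scipy/linalg/tests/test_decomp.py::TestEigh::test_eigh"]


def describe_exit(returncode):
    """Map a subprocess return code to a short label.

    On POSIX, a child killed by signal N is reported as -N.
    """
    if returncode == 0:
        return "passed"
    if returncode < 0:
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            # Signal number without a symbolic name on this platform.
            return f"signal {-returncode}"
    return f"exit {returncode}"


def run_many(cmd, runs=100, prefix="run"):
    """Run cmd repeatedly, save each run's output, and tally the outcomes."""
    outcomes = Counter()
    for i in range(runs):
        # One text file per run, e.g. run_000.txt (hypothetical naming).
        with open(f"{prefix}_{i:03d}.txt", "w") as out:
            proc = subprocess.run(cmd, stdout=out, stderr=subprocess.STDOUT)
        outcomes[describe_exit(proc.returncode)] += 1
    return outcomes

# Example (from the SciPy source root): print(run_many(CMD))
```

Keeping one output file per run makes it possible to grep for the last test ID reached before each crash.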

Grepping the output files from ./runtests.py, I noticed that the last known position in Python before a crash was one of three lines: 873, 876, or 877. Line 873 is the actual call to eigh(), while lines 876 and 877 are where the arrays returned from eigh() are accessed, so the crash can surface after the call has already returned.

Only 6 out of 100 runs passed without any problems.

In some cases (35 out of the 100), Python segfaults after nominally completing all the tests in TestEigh::test_eigh.

In the cases where Python was killed with SIGABRT, 36 were at line 873 (the call to eigh()), while 9 were at line 876, where the output z is used. In many other runs, the test script did not appear in the Python backtrace, if there was one.

The parametrized inputs that triggered the crash were of the form test_eigh[6-D-XXX-YYY-ZZZ-eigvals1]. That is, the crashes happened for dimension 6, dtype double complex, with eigvals= keyword parameter set to the tuple (2, 4). The XXX–ZZZ parameters are boolean flags for keywords turbo, lower, and overwrite respectively.

An incomplete tally of the parameters (turbo, lower, and overwrite), where Python crashed before finishing all the tests, is as follows:

   5 False-False-False
  11 False-False-True
  13 False-True-False
   6 False-True-True
   7 True-False-False
   4 True-False-True
  15 True-True-True

The combination (turbo=True, lower=True, overwrite=False) is the only one of the 2^3 = 8 cases that has not yet been observed to crash.
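A tally like the one above can be produced from the test IDs, and the missing combination found by enumerating all eight flag settings. The sketch below is illustrative only: the function names are hypothetical, and it assumes the last test ID reached in each crashed run has already been extracted from the saved output.

```python
import re
from collections import Counter
from itertools import product

# Matches IDs such as test_eigh[6-D-True-False-True-eigvals1];
# the three captured flags are turbo, lower, and overwrite.
TEST_ID = re.compile(
    r"test_eigh\[6-D-(True|False)-(True|False)-(True|False)-eigvals1\]"
)


def tally_flags(last_ids):
    """Count (turbo, lower, overwrite) combinations among the given test IDs."""
    counts = Counter()
    for text in last_ids:
        match = TEST_ID.search(text)
        if match:
            counts[tuple(flag == "True" for flag in match.groups())] += 1
    return counts


def missing_combinations(counts):
    """Return the flag combinations that never appear in counts."""
    return [combo for combo in product([False, True], repeat=3)
            if combo not in counts]


# The tally reported above, keyed by (turbo, lower, overwrite).
reported = Counter({
    (False, False, False): 5,
    (False, False, True): 11,
    (False, True, False): 13,
    (False, True, True): 6,
    (True, False, False): 7,
    (True, False, True): 4,
    (True, True, True): 15,
})
print(missing_combinations(reported))  # → [(True, True, False)]
```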

Reproducing code example:

./runtests.py -vt scipy/linalg/tests/test_decomp.py::TestEigh::test_eigh

Scipy/Numpy/Python version information:

SciPy master branch as of ae34ce48, NumPy 1.18.1, Python 3.7.6, conda on macOS with MKL 2019.4.

Top GitHub Comments

ilayn commented, Apr 29, 2020

Intel team confirmed the bug and included the fix for the upcoming MKL 2020 update 2.

oleksandr-pavlyk commented, Apr 6, 2020

@ilayn Done.
