Simple Python exception causes segmentation fault on vcs

We have been struggling with random segmentation faults in our verification suite running on VCS, and until now we had assumed the problem was in our own code. However, I recently noticed the same behaviour with a very simple test case, which suggests a problem in cocotb itself.

Let’s say I have the following simple test:

import cocotb

@cocotb.test()
def check_top_level(dut):
    """Checks the simulation was built and can be clocked."""
    yield do_something(dut)

The coroutines it uses are defined in another file:

import cocotb
# Note I have commented out this import on purpose...
# from cocotb.triggers import Timer

@cocotb.coroutine
def simple_coroutine(dut):
    yield Timer(500)

@cocotb.coroutine
def do_something(dut):
    # dut.i_bank_rst__b_a = 0    # Uncomment me to cause a segfault
    yield simple_coroutine(dut)

If you run this as-is, you get a Python exception because the name Timer cannot be resolved. However, if you access any of the Verilog signals first (e.g., uncomment the line that drives a reset signal in our design), the same error no longer surfaces as an exception and causes a segfault instead. Our guess is that the embedded Python interpreter shuts down before the exception is printed, and Python then touches invalid memory.
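To make the timing of that exception concrete, here is a minimal sketch in plain Python, with no cocotb or simulator involved. It assumes that the coroutines behave like ordinary generators: a dict stands in for the dut handle, `yield from` stands in for the scheduler driving the nested coroutine, and `first_step` is a hypothetical helper. It only illustrates that the signal write and the NameError both happen on the first step of the generator, not when it is created.

```python
def simple_coroutine(dut):
    # Timer is deliberately undefined, mirroring the commented-out import.
    yield Timer(500)


def do_something(dut):
    # Stand-in for the signal write; here dut is a plain dict (assumption).
    dut["i_bank_rst__b_a"] = 0
    # Stand-in for the scheduler running the nested coroutine (assumption).
    yield from simple_coroutine(dut)


def first_step(gen):
    """Advance a generator once and describe what happened."""
    try:
        next(gen)
    except NameError as exc:
        return f"NameError: {exc}"
    return "no exception"


dut = {}
gen = do_something(dut)
# Creating the generator runs none of its body, so nothing is written yet.
created_without_write = "i_bank_rst__b_a" not in dut

# The first step performs the write and then hits the undefined name.
result = first_step(gen)
print(created_without_write, dut, result)
```

In this model the write completes before the NameError is raised, which matches the ordering described in the report: the design is accessed first, and the exception arrives before the coroutine reaches its first yield.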

We have seen similar failures (a segfault or a lockup) whenever a Python exception is raised after we have accessed the Verilog but before the first yield of a coroutine. It seems to be related to how the Python interpreter embedded in cocotb shuts down.

I think if we could get to the root of this issue in cocotb, we could then begin to find the real cause of our segfaults, which could be a simple Python exception coming from somewhere.
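One way to hunt for such a hidden exception is to print it eagerly, before it propagates into whatever is driving the coroutine. The sketch below is plain Python and makes several assumptions: `report_exceptions` is a hypothetical helper, not part of cocotb; the coroutine is modelled as an ordinary generator; and `UndefinedTrigger` is an intentionally undefined name. Whether this helps in a real cocotb run on VCS is untested here.

```python
import contextlib
import functools
import io
import sys
import traceback


def report_exceptions(gen_func):
    """Print and flush any exception from a generator function, then re-raise it."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        try:
            result = yield from gen_func(*args, **kwargs)
        except Exception:
            # Write the traceback immediately so it is not lost if the
            # process dies before normal error reporting happens.
            traceback.print_exc(file=sys.stderr)
            sys.stderr.flush()
            raise
        return result
    return wrapper


@report_exceptions
def broken(dut):
    dut["touched"] = True        # stand-in for a signal access
    yield UndefinedTrigger(500)  # intentionally undefined: raises NameError


# Demonstration: capture stderr to show the traceback is emitted at the
# point of failure, and that the exception still propagates.
captured = io.StringIO()
propagated = False
with contextlib.redirect_stderr(captured):
    try:
        next(broken({}))
    except NameError:
        propagated = True
report = captured.getvalue()
```

In a cocotb test bench, the assumption would be to place such a wrapper directly on the generator function, underneath the cocotb decorator, so the traceback is written before control returns to the scheduler.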

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 1
  • Comments: 44 (19 by maintainers)

Top GitHub Comments

2 reactions
Alex-Mann commented, Jul 11, 2020

I was hitting the same issue with VCS and can confirm that adding either of the following lines at the end of the test sequence fixes it:

await RisingEdge(tb.clk)
await Timer(1, units='ns')

1 reaction
marlonjames commented, Jan 17, 2020

I think on most simulators this is when the sim ends as stop_simulator never returns on Questa or Icarus. But Py_Finalize isn’t called until the shutdown callback is run. Can you put in some debugging prints to ensure it is being run?

Icarus and Aldec Active-HDL return from gpi_sim_end, which is why I added the sim_ending flag in simulatormodule.c and set it in stop_simulator. Once the cocotb scheduler returns back into handle_gpi_callback, gpi_sim_end is called which eventually calls Py_Finalize.
