[RFC] New Testers Proposal

This is a proposal for a new testers API, and supersedes issues #551 and #547. Nothing is currently set in stone, and feedback from the general Chisel community is desired. So please give it a read and let us know what you think!

Motivation

What’s wrong with Chisel BasicTester or HWIOTesters?

The BasicTester included with Chisel is a way to define tests as a Chisel circuit. However, as testvectors often are specified linearly in time (like imperative software), this isn’t a great match.

HWIOTesters provide a peek/poke/step API, which allows tests to be written linearly in time. However, there’s no support for parallelism (like a threading model), which makes composition of concurrent actions very difficult. Additionally, as it’s not in the base Chisel3 repository, it doesn’t seem to see as much use.

HWIOTesters also provides AdvancedTester, which allows limited background tasks to run on each cycle, supporting certain kinds of parallelism (for example, every cycle, a Decoupled driver could check if the queue is ready, and if so, enqueue a new element from a given sequence). However, the concurrent programming model is radically different from the peek-poke model, and requires the programmer to manage time as driver state.

And finally, having 3 different test frameworks really kind of sucks and limits interoperability and reuse of testing libraries.

Goal: Unified testing

The goal here is to have one standardized way to test in Chisel3. Ideally, this would be:

suitable for both unit tests and system integration tests
designed for composable abstractions and layering
able to target multiple backends and simulators (possibly requiring a link to Scala, if the testvector is not static, or using a limited test constructing API subset, when synthesizing to FPGA)
included in base chisel3, to avoid packaging and dependency nightmares
highly usable, encouraging unit tests by making it as easy, painless (avoiding boilerplate and other nonsense), and useful as possible to write them

Proposal

Testdriver Construction API

This will define an API for constructing testdriver modules.

Basic API

These are the basic conceptual operations:

Peek: returns the value of a circuit node
Check: asserts that a circuit node has some value; similar semantics to peek (details below)
Poke: pokes a value into a circuit node
Step: blocks until the next rising edge of the specified clock (for single-clock designs, equivalent to stepping the clock). Note: a better name is desired for this…

A subset of this API (poke, check, step) is synthesizable, which allows the generation of testbenches that don’t require Scala to run with the simulator.

Values are specified and returned as Chisel literals, which are expected to interoperate with the future bundle literal constructors feature. In the future, this may be relaxed to allow any Chisel expression.

Peek, check, and poke will be defined as extensions of their relevant Chisel types using the PML (implicit extension) pattern. For example, users would specify io.myUInt.poke(4.U), or io.myUInt.peek() would return a Chisel literal containing the current simulation value.

This is to combine driver code with its respective Bundle, allowing drivers to be shared and re-used without being tied to some TestDriver subclass. For example, Decoupled might define a pokeEnqueue function which sequences the ready, valid, and bits wires and can be invoked with io.myQueue.pokeEnqueue(4.U). These can then be composed: for example, a GCD IO with Decoupled input and output might have gcd.io.checkRun(4, 2, 2), which will enqueue (4, 2) on the inputs and expect 2 on the output when it finishes.

Pokes retain their values until updated by another poke.
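
As a rough illustration of the implicit extension pattern described above, here is a minimal plain-Scala sketch. It does not use Chisel: Signal and Sim are hypothetical stand-ins for a circuit node and a simulator backend, and the poke/peek/check signatures are assumptions rather than the proposed API.

```scala
import scala.collection.mutable

object PokeSketch {
  // Hypothetical stand-in for a circuit node; not a Chisel type.
  final case class Signal(name: String)

  // Hypothetical stand-in for a simulator backend: just a table of node values.
  final class Sim {
    private val values =
      mutable.Map.empty[Signal, BigInt].withDefaultValue(BigInt(0))
    def write(s: Signal, v: BigInt): Unit = values(s) = v
    def read(s: Signal): BigInt = values(s)
  }

  // The extension: poke/peek/check become methods on Signal itself, so
  // driver code can live next to the type instead of in a tester subclass.
  implicit class TestableSignal(s: Signal)(implicit sim: Sim) {
    def poke(v: BigInt): Unit = sim.write(s, v)
    def peek(): BigInt = sim.read(s)
    def check(expected: BigInt): Unit =
      require(peek() == expected,
        s"${s.name}: expected $expected, got ${peek()}")
  }
}
```

With an implicit Sim in scope, Signal("a").poke(BigInt(4)) followed by Signal("a").peek() returns the poked value, and it keeps returning that value until the next poke to the same node.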

Concurrency Model

Concurrency is provided by fork-join parallelism, to be implemented using threading. Note: Scala’s coroutines are too limited to be of practical use here.

Fork: spawns a thread that operates in parallel, returning that thread
Join: blocks until all the argument threads are completed
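
A minimal sketch of how fork and join could be layered on plain JVM threads follows. ThreadList is a hypothetical name, and the sketch ignores clock synchronization entirely; it only shows the chaining shape, where fork can also be called on an existing thread list.

```scala
object ForkJoinSketch {
  // Hypothetical thread-list type: each fork appends one running thread.
  final class ThreadList(val threads: List[Thread]) {
    def fork(body: => Unit): ThreadList = {
      val t = new Thread(new Runnable { def run(): Unit = body })
      t.start()
      new ThreadList(threads :+ t)
    }
  }

  // Spawns the first thread of a new list.
  def fork(body: => Unit): ThreadList = new ThreadList(Nil).fork(body)

  // Blocks until every thread on the list has finished.
  def join(list: ThreadList): Unit = list.threads.foreach(_.join())
}
```

A call such as join(fork { ... }.fork { ... }) then starts both bodies in parallel and returns once both have completed.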

Combinational Peeks and Pokes

There are two proposals for the combinational behavior of pokes. Debate is ongoing about which model to adopt, or whether both can coexist.

Proposal 1: No combinational peeks and pokes

Peeks always return the value at the beginning of the cycle. Alternatively phrased, pokes don’t take effect until just before the step. This provides both high performance (no need to update the circuit between clock cycles) and safety against race conditions with threaded concurrency (because poke effects can’t be seen until the next cycle, and all testers are synchronized to the clock cycle, but not synchronized in between).
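
The deferred-poke semantics can be sketched with a toy model that has no circuit at all, only a table of visible values and a table of pending pokes. CycleSim and its methods are hypothetical names, and the sketch is single-threaded.

```scala
import scala.collection.mutable

object DeferredPokeSketch {
  final class CycleSim {
    // Values as seen at the beginning of the current cycle.
    private val visible =
      mutable.Map.empty[String, BigInt].withDefaultValue(BigInt(0))
    // Pokes issued during the current cycle, not yet visible.
    private val pending = mutable.Map.empty[String, BigInt]

    def poke(node: String, v: BigInt): Unit = pending(node) = v
    def peek(node: String): BigInt = visible(node)

    // Pending pokes are applied just before the clock advances.
    def step(): Unit = {
      visible ++= pending
      pending.clear()
    }
  }
}
```

In this model a peek issued after a poke in the same cycle still returns the old value; the new value only becomes visible after step().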

One issue is that peeks can be written after pokes yet still return the pre-poke value. This can be handled with documentation and possibly optional runtime checks against “stale” peeks. Additionally, this model makes it impossible to test combinational logic directly, but this can be worked around with register insertion.

Note that it isn’t feasible to ensure all peeks are written before pokes for composition purposes. For example, Decoupled.pokeEnqueue may peek to check that the queue is ready before poking the data and valid, and calling pokeEnqueue twice on two different queues in the same cycle would result in a sequence of peek, poke, peek, poke.

Another previous proposal was to allow pokes to affect peeks, but to check that the results of peeks are still valid at the end of the cycle. While powerful, this potentially leads to brittle and nondeterministic testing libraries and is not desirable.

Proposal 2: Combinational peeks and pokes that do not cross threads

Peeks and pokes are resolved in the order written (combinational peeks and pokes are allowed and straightforward). Pokes may not affect peeks from other threads, and this is checked at runtime using reachability analysis.

This provides easy testing of combinational circuits while still allowing deterministic execution in the presence of threading. Since whether a poke affects a peek is determined by combinational reachability analysis (which is circuit-static, instead of ad-hoc value change detection), thread execution order cannot affect the outcome of a test. Note that clocks act as a global synchronization boundary on all threads.
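
One way to picture the cross-thread check is as a graph search over combinational paths. The sketch below assumes the circuit has already been reduced to a map from each node to the nodes it drives without an intervening register. The Graph type, the function names, and the integer thread ids are all hypothetical.

```scala
object ReachabilitySketch {
  // Combinational edges: node -> nodes it drives with no register in between.
  type Graph = Map[String, Set[String]]

  // All nodes combinationally reachable from `from`, including itself.
  def reachable(g: Graph, from: String): Set[String] = {
    @annotation.tailrec
    def loop(frontier: List[String], seen: Set[String]): Set[String] =
      frontier match {
        case Nil => seen
        case n :: rest =>
          val next = g.getOrElse(n, Set.empty[String]).filterNot(seen)
          loop(rest ++ next, seen ++ next)
      }
    loop(List(from), Set(from))
  }

  // A poke is a problem for a peek only if the two come from different
  // threads and the poked node can combinationally reach the peeked node.
  def crossThreadConflict(g: Graph,
                          pokeNode: String, pokeThread: Int,
                          peekNode: String, peekThread: Int): Boolean =
    pokeThread != peekThread && reachable(g, pokeNode).contains(peekNode)
}
```

Because the answer depends only on the static graph and on which thread issued each operation, it is the same regardless of the order in which threads happen to run.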

One possible issue is whether such reachability analysis will have a high false-positive rate. We don’t know right now, and this is something we basically have to implement and see.

Efficient simulation performance is possible by using reachability analysis to determine if the circuit needs to be updated between a poke and peek. Furthermore, it may be possible to determine if only a subset of the circuit needs to be updated.

Multiclock Support

This section is preliminary.

As testers only synchronize to an external clock, a separate thread can drive clocks in any arbitrary relationship.

This is the part which has seen the least attention and development (so far), but robust multiclock support is desired.

Backends

The first backend will be FIRRTerpreter, because Verilator compilation is slow (probably accounts for a significant fraction of time in running chisel3 regressions) and doesn’t support all platforms well (namely, Windows).

High performance interfaces to Verilog simulators may be possible using Java JNI to VPI instead of sockets.

Conflicting Drivers

This section is preliminary.

Conflicting drivers (multiple pokes to the same wire from different threads on the same cycle, even if they have the same value) are prohibited and will error out.

There will probably be some kind of priority system to allow overriding defaults, for example, pulling a Decoupled’s valid low when not in use.
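
Since this section is preliminary, the following is only one possible reading of the rule, sketched in plain Scala. It tracks the pokes made to each wire within one cycle, rejects a second poke from a different thread at the same priority (even with an equal value), and lets a higher priority override a lower one. The names Drive and CycleDrivers and the integer priority scheme are assumptions.

```scala
import scala.collection.mutable

object DriverConflictSketch {
  final case class Drive(thread: Int, value: BigInt, priority: Int)

  final class CycleDrivers {
    private val drives = mutable.Map.empty[String, Drive]

    // Records a poke for the current cycle. Throws if another thread already
    // drove the wire at the same priority, even when the values agree.
    def poke(wire: String, thread: Int, value: BigInt, priority: Int = 1): Unit =
      drives.get(wire) match {
        case Some(d) if d.thread != thread && d.priority == priority =>
          throw new IllegalStateException(
            s"conflicting drivers on $wire: threads ${d.thread} and $thread")
        case Some(d) if d.priority > priority =>
          () // the existing higher-priority drive wins
        case _ =>
          drives(wire) = Drive(thread, value, priority)
      }

    def value(wire: String): Option[BigInt] = drives.get(wire).map(_.value)

    // Conflicts are tracked per cycle, so the record resets at each step.
    def nextCycle(): Unit = drives.clear()
  }
}
```

Under this reading, a low-priority default such as pulling valid low can be overridden by a normal-priority poke from another thread, while two normal-priority pokes from different threads raise an error.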

Some test systems have a notion of wire ownership, specifying who can drive a wire to prevent conflicts. However, as this proposal doesn’t use an explicit driver model (theoretically saving on boilerplate code and enabling concise tests), this may not be feasible.

Misc

No backwards compatibility. As all of the current Chisel testers are extremely limited in capability, many projects have opted to use other testing infrastructure. Migrating existing test code to this new infrastructure will require rewriting. Existing test systems will be deprecated but may continue to be maintained in parallel.

It may be possible to create a compatibility layer that exposes the old API.

Mock construction and blackbox testing. This API may be sufficient to act as a mock construction API, and may enable testing of black boxes (in conjunction with a Verilog simulator).

Examples

Decoupled, linear style

implicit class DecoupledTester[T](in: Decoupled[T]) {
  // Alternatively, this could directly be in Decoupled
  def enqueue(data: T) {
    require(in.ready, true.B)
    in.valid.poke(true.B)
    in.bits.poke(data)
    step(1)
    in.valid.poke(false.B, priority=low)
  }
}

// Testdriver is a subclass of Module, which must be called from a Tester environment.
// Example DUT-as-child structure
class MyTester extends Testdriver {
  val myDut = Module(new MyModule())
  // MyModule with IO(new Bundle {
  //  val in = Flipped(Decoupled(UInt(8.W)))
  //  val out = Decoupled(UInt(8.W))  // transaction of in + ctrl
  //  val in2 = Flipped(Decoupled(UInt(8.W)))
  //  val out2 = Decoupled(UInt(8.W))  // transaction of in + ctrl
  //  val ctrl = UInt(8.W)
  //} )

  myDut.io.in.enqueue(42.U)  // steps a cycle inside
  myDut.io.out.dequeueExpect(43.U)  // waits for output valid, checks bits, sets ready, step
  myDut.io.ctrl.poke(2.U)  // .poke added by PML to UInt
  myDut.io.in.enqueue(45.U)
  myDut.io.out.dequeueExpect(47.U)

  // or with parallel constructs
  myDut.io.ctrl.poke(4.U)

  join(fork {
    myDut.io.in.enqueue(44.U)
    myDut.io.out.dequeueExpect(48.U)
    myDut.io.in.enqueue(46.U)
    myDut.io.out.dequeueExpect(50.U)
  } .fork {  // can be called on a thread-list, spawns a new thread that runs in parallel with the threads on the list - lightweight syntax for spawning many parallel threads
    myDut.io.in2.enqueue(1.U)
    myDut.io.out2.dequeueExpect(5.U)
    myDut.io.in2.enqueue(7.U)
    myDut.io.out2.dequeueExpect(11.U)
  })
  // tester ends at the end of the Testdriver body, once all spawned threads have completed
}

External Extensions

These items are related to testing, but are mostly orthogonal and can be developed separately. However, they will be expected to interoperate well with testers:

SystemVerilog Assertions (basically LTL on circuits)
Constrained random generation
Memory initialization

Issue Analytics

State:
Created 6 years ago
Reactions: 3
Comments: 54 (36 by maintainers)

Top GitHub Comments

seldridge commented, Mar 16, 2018

I’ll give the rest of this a read and provide some feedback.

@ducky64: This came up when going through the generator bootcamp with some questions related to multiclock testing. This was the solution that I came up with. It defines a MultiClockTest with an abstract member def clocks: Seq[ClockInfo]. ClockInfo defines a mapping of each clock to a period and phase. Based on the period/phases specified, this generates clocks for you and connects them.

Multiclock module:

import chisel3._
import chisel3.experimental.withClock

class MultiClockModule extends Module {
  val io = IO(
    new Bundle {
      val clockA = Input(Clock())
      val clockB = Input(Clock())
      val clockC = Input(Clock())
      val a = Output(Bool())
      val b = Output(Bool())
      val c = Output(Bool())
    })

  /* Make each output (a, b, c) toggle using their respective clocks
   * (clockA, clockB, clockC) */
  Seq(io.clockA, io.clockB, io.clockC)
    .zip(Seq(io.a, io.b, io.c))
    .foreach{ case (clk, out) => { withClock(clk) { out := RegNext(~out) } }}
}

Multiclock test:

import chisel3._
import chisel3.experimental.RawModule
import chisel3.util.Counter
import chisel3.testers.{BasicTester, TesterDriver}
import chisel3.iotesters.{PeekPokeTester, ChiselFlatSpec}

/** A description of the period and phase associated with a specific
  * clock */
case class ClockInfo(signal: Clock, period: Int, phase: Int = 0)

/** A clock generator of a specific period and phase */
class ClockGen(period: Int, phase: Int = 0) extends Module {
  require(period > 0)
  require(phase >= 0)
  val io = IO(
    new Bundle {
      val clockOut = Output(Bool())
    })

  println(s"Creating clock generation with period $period, phase $phase")

  val (_, start) = Counter(true.B, phase)
  val started = RegInit(false.B)
  started := started | start

  val (count, _) = Counter(started, period)
  io.clockOut := count >= (period / 2).U
}

trait MultiClockTester extends BasicTester {
  self: BasicTester =>

  /* Abstract method (you need to fill this in) that describes the clocks */
  def clocks: Seq[ClockInfo]

  /* The finish method is called just before elaboration by TesterDriver.
   * This is used to generate and connect the clocks defined by the
   * ClockInfo of this module. */
  override def finish(): Unit = {
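    // Use the periods as-is when all are even; otherwise double every period
    // and phase so that each half-period is a whole number of ticks.
    val scale = clocks
      .map{ case ClockInfo(_, p, _) => p % 2 == 0 }
      .reduce( _ && _ ) match {
        case true => 1
        case false => 2 }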
    clocks.foreach{ case ClockInfo(c, p, ph) =>
      c := Module(new ClockGen(p * scale, ph * scale)).io.clockOut.asClock }
  }
}

class MultiClockTest(timeout: Int) extends BasicTester with MultiClockTester {

  /* Instantiate the design under test */
  val dut = Module(new MultiClockModule)

  /* Define the clocks */
  val clocks = Seq(
    ClockInfo(dut.io.clockA, 3),
    ClockInfo(dut.io.clockB, 7),
    ClockInfo(dut.io.clockC, 7, 2))

  val (countA, _) = Counter(dut.io.a =/= RegNext(dut.io.a), timeout)
  val (countB, _) = Counter(dut.io.b =/= RegNext(dut.io.b), timeout)
  val (countC, _) = Counter(dut.io.c =/= RegNext(dut.io.c), timeout)

  val (_, timeoutOccurred) = Counter(true.B, timeout)
  when (timeoutOccurred) {
    printf(p"In ${timeout.U} ticks, io.a ticked $countA, io.b ticked $countB, io.c ticked $countC\n")
    stop()
  }
}

class MultiClockSpec extends ChiselFlatSpec {

  "ClockGen" should "throw exceptions on bad inputs" in {
    Seq(() => new ClockGen(0, 0),
        () => new ClockGen(1, -1))
      .foreach( gen =>
        intercept[IllegalArgumentException] { Driver.elaborate(gen) } )
  }

  "MultiClockTest" should "work" in {
    TesterDriver.execute(() => new MultiClockTest(128))
  }
}
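
To make the period scaling in finish easier to follow, here is a plain-Scala sketch of its apparent intent: when any period is odd, all periods and phases are doubled so that a 50% duty cycle lands on whole ticks. ClockSpec, scaleFor, scaled and level are hypothetical names. The level function models an idealized waveform and ignores the start-up latency of the hardware counters in ClockGen.

```scala
object ClockScaleSketch {
  final case class ClockSpec(period: Int, phase: Int = 0)

  // 1 if every period is even, otherwise 2.
  def scaleFor(clocks: Seq[ClockSpec]): Int =
    if (clocks.forall(_.period % 2 == 0)) 1 else 2

  // Applies the common scale factor to every period and phase.
  def scaled(clocks: Seq[ClockSpec]): Seq[ClockSpec] = {
    val s = scaleFor(clocks)
    clocks.map(c => ClockSpec(c.period * s, c.phase * s))
  }

  // Idealized clock level at base tick t: low before the phase offset, then
  // low for the first half of each period and high for the second half.
  def level(c: ClockSpec, t: Int): Boolean =
    t >= c.phase && ((t - c.phase) % c.period) >= c.period / 2
}
```

For the clocks in the test above (periods 3, 7 and 7, with the last one at phase 2), the scale factor is 2, giving periods 6, 14 and 14 and phases 0, 0 and 4 in base ticks.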

chick commented, Mar 15, 2018

@ducky64 This is a late comment, but it would be nice for this development to include the ability to test a DUT against a golden model. The golden model might be an earlier version of the DUT whose behavior you want the new version to match. The golden model should also be implemented in Scala, perhaps as some sort of mock.
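
A golden-model comparison could look roughly like the following plain-Scala sketch, which treats both the DUT and the golden model as functions from input to output and reports the first disagreement. The DUT here is only a stand-in Scala function, since driving a real simulation is outside the scope of the sketch. All names are hypothetical.

```scala
object GoldenModelSketch {
  // Golden model: Euclid's algorithm written directly in Scala.
  def gcdGolden(a: Int, b: Int): Int = if (b == 0) a else gcdGolden(b, a % b)

  // Stand-in for the DUT: subtractive GCD, as a hardware GCD unit might
  // compute it. Only valid for positive inputs.
  def gcdDut(a: Int, b: Int): Int = {
    var x = a
    var y = b
    while (x != y) { if (x > y) x -= y else y -= x }
    x
  }

  // Returns the first input on which the two disagree, with both outputs
  // (DUT output first, golden output second).
  def firstMismatch[I, O](inputs: Seq[I])(dut: I => O,
                                          golden: I => O): Option[(I, O, O)] =
    inputs.iterator
      .map(i => (i, dut(i), golden(i)))
      .find { case (_, d, g) => d != g }
}
```

Running firstMismatch over a list of positive input pairs with (gcdDut _).tupled and (gcdGolden _).tupled returns None when the two implementations agree on every input.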