Flaky tests and circuit operation selection
Introduction
Several tests, including test_clifford_circuit_2 in Cirq/cirq-core/cirq/sim/clifford/clifford_simulator_test.py and test_example_runs_bb84 in Cirq/examples/examples_test.py, appear to be flaky when all seed-setting code (e.g. np.random.seed(0) or tf.random.set_seed(0)) is commented out.

For instance, at commit 8cef3d9dc16b27e3b10184e1b72afa764efe590d (version 0.11.0), test_clifford_circuit_2[qubits0] and test_clifford_circuit_2[qubits1] fail ~24% and ~30% of the time, respectively (each out of 500 runs), compared to 0% of the time (each out of 500 runs) when the seed-setting code is left in place. Similarly, test_example_runs_bb84 fails ~32% of the time (out of 500 runs) compared to 0% of the time (out of 500 runs) when the seed-setting code is left in place.

test_clifford_circuit_2 tests the Clifford circuit simulator, while test_example_runs_bb84 tests the example implementation of the BB84 QKD protocol.
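For reference, failure rates like the ones above can be estimated by re-running a test many times and counting non-zero exit codes. A minimal sketch is below; the test id, run count, and use of subprocess are illustrative and not taken from the issue.

```python
# Hypothetical script for estimating a test's failure rate by repeated runs.
# The test id and run count below are placeholders, not from the issue.
import subprocess

TEST_ID = "cirq-core/cirq/sim/clifford/clifford_simulator_test.py::test_clifford_circuit_2"
RUNS = 500

failures = 0
for _ in range(RUNS):
    # Each pytest invocation is a fresh process, so when no seed is set the
    # global NumPy random state is initialized differently on every run.
    completed = subprocess.run(["pytest", "-q", TEST_ID], capture_output=True)
    failures += completed.returncode != 0

print(f"failure rate: {failures / RUNS:.1%}")
```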
Motivation
Some tests can be flaky with high failure rates, but the flakiness goes undetected while the seeds are set. We are trying to stabilize such tests.
Environment
The tests were run using pytest 6.2.2 in a conda environment with Python 3.6.13. The OS used was Ubuntu 16.04.
Discussion
The flakiness appears to stem from the random selection of circuit operations, which the seed-setting code normally makes deterministic. For example, test_clifford_circuit_2 checks the value of sum(result.measurements['0'])[0] at the end of the test and ensures that it is between 20 and 80. When the seed-setting code is present, this value is always 49, because the circuit being simulated is the same on every run. When the seed-setting code is removed, however, the circuit being simulated varies from run to run, and the value of sum(result.measurements['0'])[0] is not always between 20 and 80. test_example_runs_bb84 is flaky when the seed-setting code is removed for a similar reason.
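To make the mechanism concrete, here is a simplified analogue of the pattern described above; it is not the actual test code, and the gate set, circuit depth, and helper name are invented for illustration.

```python
# Simplified analogue (not the real Cirq test): gates are chosen through the
# global NumPy random state, so fixing the seed pins both the circuit and the
# resulting measurement counts.
import cirq
import numpy as np

def build_random_circuit(qubits, depth=20):
    circuit = cirq.Circuit()
    for _ in range(depth):
        # The gate choice consumes the global np.random state.
        choice = np.random.randint(3)
        if choice == 0:
            circuit.append(cirq.X(qubits[0]))
        elif choice == 1:
            circuit.append(cirq.H(qubits[0]))
        else:
            circuit.append(cirq.CNOT(qubits[0], qubits[1]))
    circuit.append(cirq.measure(qubits[0], key='0'))
    return circuit

np.random.seed(0)  # commenting this out makes the circuit differ on every run
qubits = cirq.LineQubit.range(2)
result = cirq.CliffordSimulator().run(build_random_circuit(qubits), repetitions=100)

# With the seed fixed, this count is identical across runs; without it, some
# generated circuits leave the measured qubit in a computational-basis state,
# pushing the count to 0 or 100 and outside a 20..80 assertion window.
print(sum(result.measurements['0'])[0])
```

With the global seed in place, the assertion is really checking a single, fixed circuit, which is why the count observed in the test never changes.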
We would be interested in learning whether setting the seed means that only a restricted class of circuit operations is ever selected. We would also be interested in learning whether there are ways of addressing the seed-setting code in these tests. We would be happy to raise a Pull Request to fix the tests and to incorporate any feedback that you may have.
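One possible direction, sketched below, is to draw the circuit from an explicit local generator and seed the simulator with it, instead of seeding the global NumPy state. This is only an illustration of the idea, not the change that was eventually merged, and it assumes the simulator accepts a seed/random-state argument, as cirq.CliffordSimulator does.

```python
# Sketch of avoiding the global NumPy state: use a local RandomState for
# circuit selection and pass it to the simulator as its seed.
import cirq
import numpy as np

prng = np.random.RandomState(0)           # local, test-scoped generator
qubits = cirq.LineQubit.range(2)

circuit = cirq.Circuit()
for _ in range(20):
    # Gate choices come from prng, not from np.random's global state.
    gate = [cirq.X, cirq.H, cirq.S][prng.randint(3)]
    circuit.append(gate(qubits[0]))
circuit.append(cirq.measure(qubits[0], key='0'))

# CliffordSimulator accepts a seed / random-state argument, making the
# measurement sampling reproducible as well.
simulator = cirq.CliffordSimulator(seed=prng)
result = simulator.run(circuit, repetitions=100)

# The count below is reproducible run-to-run because every source of
# randomness is tied to prng; the test's statistical assertion can stay loose.
print(sum(result.measurements['0'])[0])
```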
Comments (8 total, 3 by maintainers)
This seems reasonable, although we should first confirm that the current circuit generates a Bell state (i.e. an equal superposition of |0⟩ and |1⟩).
I'd prefer to keep the looser assertion on this, even though specifying a seed will enforce a specific result. The reason is that we want the test to capture our expectations: if the circuit produces a Bell state, measuring it has a 50% chance of producing a zero, so our expectation is that the number of zeros measured is, e.g., in the range 40 < x < 60.
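For a sense of how loose these bounds are when the circuit really behaves as a 50/50 source, the two-sided binomial tail probabilities can be computed directly. The sketch below assumes 100 repetitions (inferred from the 20..80 window quoted earlier, not stated explicitly in the issue) and assumes scipy is available.

```python
# Probability that a fair-coin count over 100 repetitions violates each
# assertion window; the repetition count of 100 is an assumption.
from scipy.stats import binom

n, p = 100, 0.5
# assertion "40 < x < 60" fails when x <= 40 or x >= 60
fail_tight = binom.cdf(40, n, p) + binom.sf(59, n, p)
# assertion "20 < x < 80" fails when x <= 20 or x >= 80
fail_loose = binom.cdf(20, n, p) + binom.sf(79, n, p)
print(f"P(40 < x < 60 fails) = {fail_tight:.3f}")   # a few percent
print(f"P(20 < x < 80 fails) = {fail_loose:.2e}")   # negligible
```

So with the circuit pinned by a seed (or otherwise guaranteed to produce an equal superposition), the looser 20 < x < 80 assertion is essentially never violated while still expressing the statistical expectation.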
@melonwater211, would you like to take on fixing this issue?
I think we can close this now, since our tests now use a fixed random seed and the function uses a fixed random generator for that test.