How to test visualization?

We need to improve the way we test visualization; that is beyond dispute. How to do it, however, is not obvious.

Compare images

The way we compare images is implemented in QiskitVisualizationTestCase.assertImagesAreEqual (test/python/visualization/visualization.py). As defined, it is very unstable: how images are generated depends heavily on uncontrolled factors, such as the available fonts. At the same time, it does not seem to be sensitive enough. Tolerating some difference (in order to absorb those uncontrolled factors) makes relevant differences hard to detect. If we are willing to reduce the tolerance to the point where semantic changes become visible, we need to treat the CI as the “ground truth”. For that, we need to save the mismatching images and update the references (using PublishBuildArtifacts like in here).
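To make the tolerance trade-off concrete, here is a minimal sketch. It is not the actual assertImagesAreEqual implementation: the function names, the artifacts directory, and the use of raw 8-bit grayscale pixel bytes are all assumptions made for illustration (a real test would first decode the PNG with an imaging library). On a mismatch, the result is written to a directory that a CI job could publish as a build artifact.

```python
from pathlib import Path


def differing_fraction(result_pixels: bytes, reference_pixels: bytes) -> float:
    """Fraction of pixels whose grayscale values differ (hypothetical metric)."""
    if len(result_pixels) != len(reference_pixels):
        return 1.0  # different sizes: treat as a complete mismatch
    if not reference_pixels:
        return 0.0
    differing = sum(a != b for a, b in zip(result_pixels, reference_pixels))
    return differing / len(reference_pixels)


def check_against_reference(name, result_pixels, reference_pixels,
                            artifacts_dir, tolerance=0.0):
    """Return True on a match; on a mismatch, save the result for inspection.

    `artifacts_dir` stands in for whatever directory the CI publishes,
    so that a reviewer can promote the saved image to a new reference.
    """
    if differing_fraction(result_pixels, reference_pixels) <= tolerance:
        return True
    artifacts = Path(artifacts_dir)
    artifacts.mkdir(parents=True, exist_ok=True)
    (artifacts / name).write_bytes(result_pixels)
    return False
```

With a loose tolerance, an image with a few antialiasing-like differences passes, but so does one where a small element is missing. With a tolerance of zero both fail, which is only workable if the references are generated in the same environment that runs the comparison.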

Mock the drawing libraries

For the LaTeX drawer, comparing the LaTeX source seems the way to go. For the matplotlib case, we should be able to mock matplotlib.figure.Figure, but I don’t know how complicated that would be.
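As a rough idea of what the mocking approach could look like, the sketch below records the drawing calls made on a mock object instead of rendering pixels. The draw_gate function is a hypothetical stand-in for a drawer, not Qiskit code, and the mock replaces an Axes-like object rather than matplotlib.figure.Figure itself, so nothing here needs matplotlib installed; only the plot and text method names mirror real matplotlib Axes methods.

```python
from unittest import mock


def draw_gate(ax, label, x, y):
    """Hypothetical drawer: a square outline plus a centred label."""
    ax.plot([x - 0.5, x + 0.5, x + 0.5, x - 0.5, x - 0.5],
            [y - 0.5, y - 0.5, y + 0.5, y + 0.5, y - 0.5],
            color="black")
    ax.text(x, y, label, ha="center", va="center")


def recorded_calls(draw, *args):
    """Run `draw` against a mock Axes and return the calls it recorded."""
    ax = mock.MagicMock()
    draw(ax, *args)
    return ax.mock_calls
```

A test can then assert on the recorded calls, for example that `mock.call.text(1, 0, "H", ha="center", va="center")` is in `recorded_calls(draw_gate, "H", 1, 0)`. This is independent of fonts and backends, but it still checks how the drawing was produced rather than whether it looks right.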

Any other ideas?

Top GitHub Comments

ajavadia commented, Aug 29, 2019

^ ditto. We are not testing against any ground “truth”, just against some status quo. Catching actual visualization bugs has always come down to a user noticing that something is off and reporting it. So why not just remove these painful tests?

mtreinish commented, Aug 28, 2019

So my issue with image comparison tests, beyond their fickleness in the face of many environmental factors, including but not limited to the mpl backend (which is hardcoded for testing in #2949), is that they’re not actually testing that things are correct. Image comparison tests just test the status quo, which may or may not be correct. We’re encoding the behavior of the current output in our reference images, not what we actually view as correct. We’ve had instances in the past where there was a bug with barriers in the reference images and we had no idea. Another perfect example is #3052: if we had LaTeX image comparison tests (or LaTeX source comparison tests), they would fail with that change, even though the output with #3052 is objectively more correct in pretty much every case. The tests do not tell us whether we have a bug; they just indicate when we’ve changed something, which doesn’t seem like much of a value add, especially when weighed against their general instability.