Fix flaky tests that have recently been popping up
Since https://github.com/pytorch/vision/pull/4497 was merged, we have been observing a few tests that fail randomly.
Before https://github.com/pytorch/vision/pull/4497, these tests were almost always run with the same RNG state, which was set by a test executed earlier in the suite. Now that all tests are properly independent and the RNG state no longer leaks between them, these tests run with a fresh RNG state on each execution, and if they are unstable they may fail.
(Note: this is a good thing; it’s better to know that they fail now rather than when submitting an unrelated PR, which is what happened in https://github.com/pytorch/vision/pull/3032#issuecomment-734829336.)
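As a rough sketch of what per-test RNG isolation can look like (an assumed autouse fixture, not necessarily the exact mechanism introduced by #4497):

```python
import pytest
import torch

# Assumed illustration of per-test RNG isolation: snapshot the global RNG
# state before each test and restore it afterwards, so a test that consumes
# random numbers cannot change the state seen by later tests.
@pytest.fixture(autouse=True)
def dont_leak_rng_state():
    state = torch.get_rng_state()
    yield
    torch.set_rng_state(state)
```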
For each of these tests we should find out whether the flakiness is severe or not. A simple solution is to parametrize the test over 100 or 1000 random seeds and check the failure rate, as sketched below. If the failure rate is reasonable we can just set a seed with torch.manual_seed(). If not, we should try to fix the test and make it more robust.
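A minimal sketch of that seed-sweep approach, assuming pytest and a placeholder check (not the actual body of any torchvision test):

```python
import pytest
import torch
from torchvision import transforms

# Hypothetical stability sweep: run the same check under many seeds and let
# the failure count indicate how flaky the underlying assertion is.
@pytest.mark.parametrize("seed", range(100))
def test_random_horizontal_flip_stability(seed):
    torch.manual_seed(seed)
    img = torch.rand(3, 16, 16)
    out = transforms.RandomHorizontalFlip(p=0.5)(img)
    # Placeholder assertion: the output is either the input or its horizontal flip.
    assert torch.equal(out, img) or torch.equal(out, img.flip(-1))
```

If only a handful of the 100 seeds fail, pinning a single known-good seed in the real test is usually acceptable; if many fail, the test itself needs to be made more robust.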
The list of tests so far is:
- test_random_apply - https://github.com/pytorch/vision/pull/4756
- test_stochastic_depth[row-0.2] - test.test_ops.TestStochasticDepth - #4758
- test_randomperspective_fill[L] - https://github.com/pytorch/vision/pull/4759
- test_randomperspective_fill[RGB] - https://github.com/pytorch/vision/pull/4759
- test_randomperspective_fill[F] - https://github.com/pytorch/vision/pull/4759
- test_random_vertical_flip - https://github.com/pytorch/vision/pull/4756
- test_random_horizontal_flip - https://github.com/pytorch/vision/pull/4756
- test_frozenbatchnorm2d_eps - #4761
- test_batched_nms_implementations - #4766
- test_backward[True-cpu] - test.test_ops.TestRoiPool - #4763
- test_backward[False-cpu] - test.test_ops.TestRoiPool - #4763
- test_random_erasing - https://github.com/pytorch/vision/pull/4764
- test_color_jitter_hue[hue2-3-cpu] - #4762
- test_color_jitter_contrast[1.5-3-cuda] - #4762
cc @pmeier
Top GitHub Comments
It’s the unstable sort. See https://github.com/pytorch/vision/pull/4766#issuecomment-952996259
I think it’s worth understanding why the open-source contributor couldn’t make the sort stable (he was facing a segfault, if I remember correctly). Fixing the sort would fix a lot of instability in the detection models, so it’s definitely worthwhile.
It would be interesting to figure out whether the 6 failures correspond to a specific edge case, but I wouldn’t spend too much time on it either.
It could just be some ties in the sorting (which is not a stable sort)?
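For illustration only (a toy snippet, not the torchvision NMS code): tied scores combined with a non-stable sort can produce a different ordering than a stable one, which is enough for two otherwise-equivalent implementations to keep different boxes.

```python
import torch

# Toy example of sort instability with ties: with several equal scores, the
# default sort gives no guarantee about the order of the tied entries,
# whereas stable=True preserves their input order.
scores = torch.tensor([0.5, 0.9, 0.5, 0.5])

_, default_order = torch.sort(scores, descending=True)              # tie order unspecified
_, stable_order = torch.sort(scores, descending=True, stable=True)  # ties keep input order

print(default_order)  # may vary between runs/backends
print(stable_order)   # tensor([1, 0, 2, 3])
```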