Test tolerances are too tight, resulting in 6 test failures
I’m seeing the following failures:
=========================== short test summary info ============================
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv1DLSTM, dynamic_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv1DLSTM, static_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv2DLSTM, dynamic_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv2DLSTM, static_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv3DLSTM, dynamic_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv3DLSTM, static_unroll)
===== 6 failed, 2461 passed, 143 skipped, 57 warnings in 277.28s (0:04:37) =====
when running on an m6a.4xlarge EC2 instance (3rd generation AMD EPYC processors). For a full log see here.
It appears to me that the tests are working as intended, but the tolerances are too tight, e.g.:
> return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=1e-05
E
E Mismatched elements: 91 / 108 (84.3%)
E Max absolute difference: 0.001465
E Max relative difference: 0.0834
E x: array([[[ 0.1921 , -0.168 , -0.425 , -0.1724 , 0.1691 ,
E -0.11523 , -0.3555 , 0.4094 , 0.2556 , 0.06256 ,
E 0.187 , 0.4253 ],...
E y: array([[[ 0.193 , -0.1675 , -0.424 , -0.1721 , 0.1694 ,
E -0.1148 , -0.3547 , 0.4097 , 0.2566 , 0.0632 ,
E 0.1871 , 0.426 ],...
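The message format above matches a numpy-style allclose check with rtol=1e-07 and atol=1e-05. As a rough illustration only (not the actual test code; the arrays are just the first few values copied from the log above), loosening the tolerance enough to cover the observed max absolute difference of ~1.5e-3 makes the comparison pass:

# Illustration only: stand-in values taken from the log above, not real
# Conv1DLSTM outputs, and the real test's tolerance helper may differ.
import numpy as np

out_a = np.array([0.1921, -0.1680, -0.4250])   # first values of "x" from the log
out_b = np.array([0.1930, -0.1675, -0.4240])   # first values of "y" from the log

# Fails: the ~1e-3 differences are far above atol=1e-05 / rtol=1e-07.
try:
    np.testing.assert_allclose(out_a, out_b, rtol=1e-7, atol=1e-5)
except AssertionError as err:
    print(err)

# Passes once the tolerance covers the observed differences
# (max absolute difference 0.001465, max relative difference 0.0834 above).
np.testing.assert_allclose(out_a, out_b, rtol=0.09, atol=2e-3)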
To reproduce:
- Check out https://github.com/NixOS/nixpkgs/commit/672f7edb4d7a14e6d2e8026f49c66be270818b0a on an m6a EC2 instance.
- Run:
nix-build -A python3Packages.dm-haiku
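For a quicker check outside the nix build, the kind of comparison these tests exercise (jitted vs. un-jitted apply of a Conv1DLSTM unrolled with hk.dynamic_unroll) can be sketched roughly as below. The function name, shapes, module parameters, and tolerances here are my own assumptions for illustration, not the exact values used by haiku/_src/integration/jax_transforms_test.py:

# Standalone sketch (assumptions: small made-up shapes and module parameters;
# the real integration test builds its modules and tolerances differently).
import jax
import numpy as np
import haiku as hk


def unroll(x):
    # x has shape [time, batch, length, channels]; input_shape excludes batch.
    core = hk.Conv1DLSTM(input_shape=(8, 3), output_channels=4, kernel_shape=3)
    initial_state = core.initial_state(batch_size=x.shape[1])
    out, _ = hk.dynamic_unroll(core, x, initial_state)
    return out


f = hk.transform(unroll)
rng = jax.random.PRNGKey(42)
x = jax.random.normal(rng, (5, 2, 8, 3))
params = f.init(rng, x)

out_nojit = f.apply(params, rng, x)
out_jit = jax.jit(f.apply)(params, rng, x)

# The kind of check that trips on the m6a instance: in float32 the jitted and
# un-jitted paths can diverge by more than atol=1e-5 on some CPUs.
np.testing.assert_allclose(out_nojit, out_jit, rtol=1e-7, atol=1e-5)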
This is with:
- Python 3.9.11
- dm-haiku 0.0.6
- jax 0.3.4
- jaxlib 0.3.0
- openblas 0.3.20
- TF 2.8.0
- Chex at https://github.com/deepmind/chex/commit/5adc10e0b4218f8ec775567fca38b68bbad42a3a.
Internally we’re testing with JAX/XLA at HEAD, so I’m fairly confident they pass with the latest stable release too. I’ll bump the versions we’re using on GHA regardless in #370, since we should be running with something more recent (I’ll stick with 0.3.5 since we have a corresponding jaxlib release).
Interesting… I’m guessing it’s an AMD vs Intel discrepancy