Test tolerances are too tight, resulting in 6 test failures
I’m seeing the following failures:
=========================== short test summary info ============================
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv1DLSTM, dynamic_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv1DLSTM, static_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv2DLSTM, dynamic_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv2DLSTM, static_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv3DLSTM, dynamic_unroll)
FAILED haiku/_src/integration/jax_transforms_test.py::JaxTransformsTest::test_jit_Recurrent(Conv3DLSTM, static_unroll)
===== 6 failed, 2461 passed, 143 skipped, 57 warnings in 277.28s (0:04:37) =====
when running on an m6a.4xlarge EC2 instance (3rd generation AMD EPYC processors). For a full log see here.
It appears to me that the tests are working as intended, but the tolerances are too tight, e.g.:
> return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=1e-05
E
E Mismatched elements: 91 / 108 (84.3%)
E Max absolute difference: 0.001465
E Max relative difference: 0.0834
E x: array([[[ 0.1921 , -0.168 , -0.425 , -0.1724 , 0.1691 ,
E -0.11523 , -0.3555 , 0.4094 , 0.2556 , 0.06256 ,
E 0.187 , 0.4253 ],...
E y: array([[[ 0.193 , -0.1675 , -0.424 , -0.1721 , 0.1694 ,
E -0.1148 , -0.3547 , 0.4097 , 0.2566 , 0.0632 ,
E 0.1871 , 0.426 ],...
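The message format above matches a numpy-style allclose check with rtol=1e-07 and atol=1e-05. As a rough illustration only (not the actual test code; the arrays are just the first few values copied from the log above), loosening the tolerance enough to cover the observed max absolute difference of ~1.5e-3 makes the comparison pass:

# Illustration only: stand-in values taken from the log above, not real
# Conv1DLSTM outputs, and the real test's tolerance helper may differ.
import numpy as np

out_a = np.array([0.1921, -0.1680, -0.4250])   # first values of "x" from the log
out_b = np.array([0.1930, -0.1675, -0.4240])   # first values of "y" from the log

# Fails: the ~1e-3 differences are far above atol=1e-05 / rtol=1e-07.
try:
    np.testing.assert_allclose(out_a, out_b, rtol=1e-7, atol=1e-5)
except AssertionError as err:
    print(err)

# Passes once the tolerance covers the observed differences
# (max absolute difference 0.001465, max relative difference 0.0834 above).
np.testing.assert_allclose(out_a, out_b, rtol=0.09, atol=2e-3)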
To reproduce:
- Check out https://github.com/NixOS/nixpkgs/commit/672f7edb4d7a14e6d2e8026f49c66be270818b0a on an m6a EC2 instance.
- Run:
nix-build -A python3Packages.dm-haiku
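For a quicker check outside the nix build, the kind of comparison these tests exercise (jitted vs. un-jitted apply of a Conv1DLSTM unrolled with hk.dynamic_unroll) can be sketched roughly as below. The function name, shapes, module parameters, and tolerances here are my own assumptions for illustration, not the exact values used by haiku/_src/integration/jax_transforms_test.py:

# Standalone sketch (assumptions: small made-up shapes and module parameters;
# the real integration test builds its modules and tolerances differently).
import jax
import numpy as np
import haiku as hk


def unroll(x):
    # x has shape [time, batch, length, channels]; input_shape excludes batch.
    core = hk.Conv1DLSTM(input_shape=(8, 3), output_channels=4, kernel_shape=3)
    initial_state = core.initial_state(batch_size=x.shape[1])
    out, _ = hk.dynamic_unroll(core, x, initial_state)
    return out


f = hk.transform(unroll)
rng = jax.random.PRNGKey(42)
x = jax.random.normal(rng, (5, 2, 8, 3))
params = f.init(rng, x)

out_nojit = f.apply(params, rng, x)
out_jit = jax.jit(f.apply)(params, rng, x)

# The kind of check that trips on the m6a instance: in float32 the jitted and
# un-jitted paths can diverge by more than atol=1e-5 on some CPUs.
np.testing.assert_allclose(out_nojit, out_jit, rtol=1e-7, atol=1e-5)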
This is with:
- Python 3.9.11
- dm-haiku 0.0.6
- jax 0.3.4
- jaxlib 0.3.0
- openblas 0.3.20
- TF 2.8.0
- Chex at https://github.com/deepmind/chex/commit/5adc10e0b4218f8ec775567fca38b68bbad42a3a.
Internally we’re testing with JAX/XLA at HEAD, so I’m fairly confident they pass with the latest stable release too. I’ll bump the versions we’re using on GHA regardless in #370, since we should be running with something more recent (I’ll stick with 0.3.5 since we have a corresponding jaxlib release).
Interesting… I’m guessing it’s an AMD vs Intel discrepancy