[Bug][Test] `test_module_sign` occasionally fails

🐛 Bug

`tests/compute/test_transform.py::test_module_sign[g0]` occasionally fails in the Torch CPU, Torch GPU, and Windows CPU unit tests. For example:

cc @mufeili

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
mufeili commented, Jun 29, 2022

The error occurs again in http://dgl-jenkins-eksvpc-2136217999.us-west-2.elb.amazonaws.com/blue/organizations/jenkins/dgl/detail/PR-4183/1/pipeline/572. @mufeili

From the error message, it does seem to be a precision issue. Perhaps increasing the tolerance passed to `torch.allclose` will address it.
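To illustrate what a larger threshold would change, here is a minimal sketch in plain Python of the element-wise check that `torch.allclose` documents, `|input - other| <= atol + rtol * |other|`, with its default `rtol=1e-05` and `atol=1e-08`. The helper name `allclose_sketch` and the sample values are hypothetical and are not taken from the failing test; they only show how a small numerical drift can fail the default tolerance and pass a looser one.

```python
def allclose_sketch(actual, expected, rtol=1e-05, atol=1e-08):
    """Plain-Python stand-in for the documented torch.allclose criterion."""
    # Every element must satisfy |actual - expected| <= atol + rtol * |expected|.
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )


# Hypothetical values: 'actual' drifts from 'expected' by a few 1e-5.
expected = [1.0, -2.0, 0.5]
actual = [1.00002, -2.00003, 0.50001]

# Default tolerances reject the drift; looser ones accept it.
strict_result = allclose_sketch(actual, expected)
loose_result = allclose_sketch(actual, expected, rtol=1e-4, atol=1e-6)
```

In the test itself, the equivalent change would presumably be passing larger `rtol` and `atol` keyword arguments to the existing `torch.allclose` call. Which values are appropriate depends on the operation under test, so the numbers above are not a recommendation.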

0 reactions
yaox12 commented, Jun 29, 2022

~~I think this could be caused by matmul precision in PyTorch. https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices.~~

Starting in PyTorch 1.7, there is a new flag called `allow_tf32`. This flag defaults to True from PyTorch 1.7 through PyTorch 1.11, and to False in PyTorch 1.12 and later.

We probably should set torch.backends.cuda.matmul.allow_tf32 = False for this test. cc @nv-dlasalle

But this doesn't explain why it also fails in the CPU tests.
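One way to apply the `allow_tf32` suggestion to a single test without changing global state for the rest of the suite is to wrap it in a context manager that restores the previous value. This is only a sketch: the helper name `tf32_matmul_disabled` is hypothetical and not part of DGL or PyTorch, and it assumes a PyTorch version (1.7 or later) that exposes `torch.backends.cuda.matmul.allow_tf32`.

```python
import contextlib

import torch


@contextlib.contextmanager
def tf32_matmul_disabled():
    """Temporarily force full-precision FP32 matmul on CUDA (hypothetical helper)."""
    previous = torch.backends.cuda.matmul.allow_tf32
    torch.backends.cuda.matmul.allow_tf32 = False
    try:
        yield
    finally:
        # Restore whatever the flag was before, even if the body raised.
        torch.backends.cuda.matmul.allow_tf32 = previous
```

A test body would then run inside `with tf32_matmul_disabled(): ...`. The flag only affects CUDA matrix multiplications, so, as noted above, it would not account for the CPU failures.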
