Distributed tests fail
Recently, while working on the ignite.contrib.metrics.regression tests, I was trying to add distributed tests to some metrics that didn't have them. When I ran those tests with world_size > 1, the result computed by the metric implemented in ignite differed from the one computed by sklearn (if imported from sklearn) or numpy. Here are some examples:
- Test wave hedges distance: _test_distrib_integration() fails when world_size > 1, and if I don't comment out all_gather() in _test_distrib_compute(), it fails as well!
- Test fractional absolute error: the same failures occur in the same two tests, _test_distrib_compute() and _test_distrib_integration().
related: #1284
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (6 by maintainers)
@sdesrozis I think we can leave this task to @KickItLikeShika.
@KickItLikeShika check here: https://pytorch.org/ignite/metrics.html#metrics-and-distributed-computations and here: https://pytorch.org/ignite/distributed.html#ignite-distributed-utils for the list of collective ops. Please try to figure out how it works on your own, and ask if you have tried everything and have been blocked for some time. Thanks!
Ah ah, take a look: the metrics are not distributed, which means a reduction is needed in the implementation. I could fix it tomorrow.
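To illustrate why a missing reduction would produce exactly this kind of mismatch, here is a minimal simulated sketch. It is not ignite's implementation: no process group is created, the two "ranks" are just list entries, and local_state and all_reduce_sum are hypothetical helpers standing in for a metric's per-process accumulators and for a SUM collective op. The error term is assumed to be the usual fractional absolute error, 2 * |y - y_pred| / (|y| + |y_pred|), averaged over all examples.

```python
# Simulated sketch: no real distributed backend is used here.
from typing import List, Tuple


def local_state(y_pred: List[float], y: List[float]) -> Tuple[float, int]:
    """Accumulate fractional-absolute-error terms for one rank's shard."""
    total = sum(2 * abs(t - p) / (abs(p) + abs(t)) for p, t in zip(y_pred, y))
    return total, len(y)


def all_reduce_sum(values: List[float]) -> float:
    """Hypothetical stand-in for a SUM collective across ranks."""
    return sum(values)


# Two simulated ranks, each holding a different shard as (y_pred, y).
shards = [
    ([1.0, 2.0], [2.0, 2.0]),            # rank 0
    ([3.0, 5.0, 1.0], [1.0, 5.0, 3.0]),  # rank 1
]
states = [local_state(p, t) for p, t in shards]

# Without a reduction, each rank divides its own sum by its own count.
per_rank = [s / n for s, n in states]

# With a reduction, every rank works from the global sum and global count.
global_sum = all_reduce_sum([s for s, _ in states])
global_n = int(all_reduce_sum([n for _, n in states]))
reduced = global_sum / global_n

# Single-process reference over the full dataset (what sklearn/numpy would see).
all_pred = [p for ps, _ in shards for p in ps]
all_true = [t for _, ts in shards for t in ts]
ref_sum, ref_n = local_state(all_pred, all_true)
reference = ref_sum / ref_n

print("per-rank (no reduction):", per_rank)
print("reduced:", reduced, "reference:", reference)
```

In this simulation only the reduced value agrees with the single-process reference; the per-rank values each reflect just one shard, which is consistent with the tests passing for world_size = 1 and failing for world_size > 1.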