Improve the performance of `containsExactly` from quadratic to linear complexity

Summary

The containsExactly assertion for iterables currently runs in quadratic time, although linear time would suffice. The cause seems to be the IterableDiff class it uses, which finds the extra and missing elements with an algorithm designed for comparing elements in any order. That is unnecessary here, because the elements have to be present in the exact order, so a more efficient linear solution could look something like this:

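import java.util.List;
import java.util.Objects;

// Assumes List inputs with cheap positional access (e.g. ArrayList).
static boolean compare(List<?> expected, List<?> actual) {
    if (expected.size() != actual.size()) return false;
    for (int i = 0; i < expected.size(); i++) {
        // Use equals() semantics; != would only compare object references.
        if (!Objects.equals(expected.get(i), actual.get(i))) return false;
    }
    return true;
}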
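
Since containsExactly applies to iterables in general and not only to lists, the same idea can also be written with two iterators, which avoids relying on size() or get(i). The following is a minimal sketch under assumptions: the class and method names are hypothetical and not part of AssertJ, and elements are compared with Objects.equals rather than with any configurable comparison strategy.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Hypothetical helper, not AssertJ code: ordered comparison in a single pass.
final class OrderedComparison {

    // Returns true when both iterables yield equal elements in the same order.
    static boolean sameElementsInOrder(Iterable<?> actual, Iterable<?> expected) {
        Iterator<?> a = actual.iterator();
        Iterator<?> e = expected.iterator();
        // Walk both sequences in lockstep and stop at the first mismatch.
        while (a.hasNext() && e.hasNext()) {
            if (!Objects.equals(a.next(), e.next())) return false;
        }
        // Equal only if both sequences are exhausted at the same time.
        return !a.hasNext() && !e.hasNext();
    }

    public static void main(String[] args) {
        System.out.println(sameElementsInOrder(List.of(1, 2, 3), List.of(1, 2, 3))); // true
        System.out.println(sameElementsInOrder(List.of(1, 2, 3), List.of(1, 3, 2))); // false
        System.out.println(sameElementsInOrder(List.of(1, 2), List.of(1, 2, 3)));    // false
    }
}
```

Each element is visited at most once, so the cost grows linearly with the length of the shorter sequence.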

We ran a scalability analysis on various assertions, which you can find here if you are interested.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 3
  • Comments: 27 (27 by maintainers)

Top GitHub Comments

2 reactions
joel-costigliola commented, Apr 24, 2022

containsExactly is not a good fit for some Sets such as HashSet (but it’s fine for SortedSet or LinkedHashSet). Here we expect users to know the semantics of the iterables being tested: if the iteration order is not consistent, the assertion will likely fail, and that is fine, since the iterable did not honor the assertion contract.

containsExactlyInAnyOrder can be used with HashSet, as pointed out by @sarajuhosova.

> Plus, it seems that IterableDiff_Test lacks testing with iterables that aren’t ArrayLists where the ordering could be different, like HashSets.

I agree that we could have used different iterable types but only those with consistent iteration order (so no HashSet).

In my experience, we should not rely on HashSet iteration order even though it can look consistent. I remember tests on one project that relied on it and failed once we migrated to a newer version of Java that changed the underlying HashSet iteration order.

So, moving forward, we should continue treating iterables as if they had a consistent iteration order for containsExactly and leave users the responsibility of choosing the correct assertion for HashSet. What we can do, though, is enhance the javadoc with a warning to prefer containsExactlyInAnyOrder over containsExactly for HashSet or any iterable whose iteration order is not predictable.

Good discussion btw!
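
To illustrate the difference between the two kinds of check discussed in this comment, here is a rough sketch of an order-insensitive comparison that suits a HashSet. It is not how AssertJ implements containsExactlyInAnyOrder: the class and method names are hypothetical, and the counting approach assumes elements have consistent equals and hashCode, which a library that supports custom element comparators cannot take for granted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not AssertJ code: any-order comparison by counting occurrences.
final class AnyOrderComparison {

    // Returns true when both iterables hold the same elements with the same multiplicities.
    static boolean sameElementsInAnyOrder(Iterable<?> actual, Iterable<?> expected) {
        Map<Object, Integer> counts = new HashMap<>();
        for (Object o : expected) {
            counts.merge(o, 1, Integer::sum);
        }
        for (Object o : actual) {
            Integer remaining = counts.get(o);
            if (remaining == null) return false; // element was not expected (or appears too often)
            if (remaining == 1) counts.remove(o);
            else counts.put(o, remaining - 1);
        }
        return counts.isEmpty(); // anything left over was expected but missing
    }

    public static void main(String[] args) {
        // HashSet iteration order is unspecified, so only an any-order check is safe here.
        Set<String> names = new HashSet<>(List.of("b", "a", "c"));
        System.out.println(sameElementsInAnyOrder(names, List.of("a", "b", "c")));              // true
        System.out.println(sameElementsInAnyOrder(List.of("a", "a", "b"), List.of("a", "b", "b"))); // false
    }
}
```

An ordered check against the same HashSet could pass or fail depending on the iteration order the JVM happens to produce, which is the situation the comment warns about.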

2 reactions
joel-costigliola commented, Apr 14, 2022

One comment about the issue: we want to keep the error message as it is.
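
One way to reconcile a linear check with an unchanged error message is to run the cheap ordered comparison first and compute the detailed diff only when the assertion is about to fail. The sketch below shows that idea under assumptions: it is not the change made in AssertJ, all names are hypothetical, and the message format is invented for illustration rather than copied from the library.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical sketch, not AssertJ code: linear fast path, slower diff only on failure.
final class FastPathCheck {

    // Linear ordered comparison used as the fast path.
    static boolean sameInOrder(List<?> actual, List<?> expected) {
        Iterator<?> a = actual.iterator();
        Iterator<?> e = expected.iterator();
        while (a.hasNext() && e.hasNext()) {
            if (!Objects.equals(a.next(), e.next())) return false;
        }
        return !a.hasNext() && !e.hasNext();
    }

    // Returns an empty Optional on success, otherwise a description of the failure.
    static Optional<String> describeFailure(List<?> actual, List<?> expected) {
        if (sameInOrder(actual, expected)) return Optional.empty();
        // Slow path (quadratic in the worst case), reached only for failing assertions.
        List<Object> missing = new ArrayList<>(expected);
        List<Object> unexpected = new ArrayList<>();
        for (Object o : actual) {
            if (!missing.remove(o)) unexpected.add(o);
        }
        if (missing.isEmpty() && unexpected.isEmpty()) {
            return Optional.of("same elements, different order");
        }
        return Optional.of("missing: " + missing + ", unexpected: " + unexpected);
    }

    public static void main(String[] args) {
        System.out.println(describeFailure(List.of(1, 2, 3), List.of(1, 2, 3)).orElse("passed")); // passed
        System.out.println(describeFailure(List.of(1, 2, 4), List.of(1, 2, 3)).orElse("passed")); // missing: [3], unexpected: [4]
    }
}
```

With this arrangement, passing assertions pay only the linear cost, while failing ones still get a full description of what differs.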
