Faster union/intersection implementation for sets?
The current union (addAll) implementation for sets is O(M * log N), as far as I understand, even when both sides have the same implementation.
It is certainly possible to improve that to a logarithmic or even sub-logarithmic implementation for HAMTs by merging only the one-bit arrays.
It is also possible to improve intersection, although at the moment the API does not even have a method for it.
Not sure if it helps maps in any way, but I guess it does.
While intersection is questionable, union is a very important and widely used operation for persistent sets (especially when + is overloaded as union), and implementing this does not even require changing the API.
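To make the idea concrete, here is a minimal sketch of a structural union and intersection over two bitmap-indexed tries. It is not the library's code and not what was eventually merged; it is written in Python for brevity, and every name in it (`Leaf`, `Node`, `EMPTY`, `hamt_of`, `union`, `intersection`, `elements`) is hypothetical. It assumes 32-bit hashes consumed 5 bits per level, and leaves that hold all items sharing one full hash. The point it illustrates: the bitmaps of two nodes are combined with a single OR (or AND), subtrees present on only one side are reused as-is, and recursion happens only where both sides have a child.

```python
from dataclasses import dataclass

BITS = 5                      # hash bits consumed per trie level (assumption)
MASK = (1 << BITS) - 1

@dataclass(frozen=True)
class Leaf:
    key_hash: int             # full 32-bit hash shared by every item in the leaf
    items: frozenset

@dataclass(frozen=True)
class Node:
    bitmap: int               # bit i set <=> a child exists for hash chunk i
    slots: tuple              # children, compacted in ascending bit order

EMPTY = Node(0, ())

def _child(node, bit):
    # Index in the compact array = number of set bits below `bit`.
    return node.slots[bin(node.bitmap & (bit - 1)).count("1")]

def _as_node(t, shift):
    # Wrap a leaf into a one-child node so both sides can be merged uniformly.
    if isinstance(t, Node):
        return t
    return Node(1 << ((t.key_hash >> shift) & MASK), (t,))

def union(a, b, shift=0):
    if a is b:                                    # shared subtree: nothing to do
        return a
    if isinstance(a, Leaf) and isinstance(b, Leaf) and a.key_hash == b.key_hash:
        return a if b.items <= a.items else Leaf(a.key_hash, a.items | b.items)
    a, b = _as_node(a, shift), _as_node(b, shift)
    bitmap = a.bitmap | b.bitmap                  # merge the bitmaps in one step
    slots, rest = [], bitmap
    while rest:
        bit = rest & -rest                        # lowest remaining set bit
        rest ^= bit
        if a.bitmap & bit and b.bitmap & bit:     # both sides: recurse
            slots.append(union(_child(a, bit), _child(b, bit), shift + BITS))
        else:                                     # one side only: reuse subtree
            slots.append(_child(a if a.bitmap & bit else b, bit))
    # Element-by-element identity check: if nothing changed, keep `a` itself.
    if bitmap == a.bitmap and all(s is t for s, t in zip(slots, a.slots)):
        return a
    return Node(bitmap, tuple(slots))

def intersection(a, b, shift=0):
    if a is b:
        return a
    if isinstance(a, Leaf) and isinstance(b, Leaf):
        common = a.items & b.items if a.key_hash == b.key_hash else frozenset()
        return Leaf(a.key_hash, common) if common else EMPTY
    a, b = _as_node(a, shift), _as_node(b, shift)
    bitmap, slots, rest = 0, [], a.bitmap & b.bitmap   # only shared branches
    while rest:
        bit = rest & -rest
        rest ^= bit
        sub = intersection(_child(a, bit), _child(b, bit), shift + BITS)
        if sub is not EMPTY:                      # prune branches that emptied out
            bitmap |= bit
            slots.append(sub)
    return Node(bitmap, tuple(slots)) if slots else EMPTY

def hamt_of(items):
    # Element-by-element construction, used here only to build test inputs.
    root = EMPTY
    for x in items:
        root = union(root, Leaf(hash(x) & 0xFFFFFFFF, frozenset([x])))
    return root

def elements(t):
    if isinstance(t, Leaf):
        yield from t.items
    else:
        for s in t.slots:
            yield from elements(s)
```

For example, `union(hamt_of(range(1000)), hamt_of(range(500, 1500)))` yields a trie whose `elements` are the integers 0 to 1499, and a union with a subset of the receiver returns the receiver unchanged. The sketch does not collapse single-leaf chains after an intersection, which a real implementation would probably want to do.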
Issue Analytics
- State:
- Created 3 years ago
- Comments:10 (10 by maintainers)

The requested optimisation was implemented by @belyaev-mikhail and merged to master (690343ace1aa4a47463f2ce42ae319876014ee47).

Ok, so I implemented the array element checking, did some benchmarks, and it does not seem to introduce much overhead (or any at all, for that matter).