Method `advanceIfNeeded` returns a count of skipped items


Is your feature request related to a problem? Please describe.

My use case: I have a very large array of objects, each carrying an ID property, and I need to reduce it to a smaller array containing only the elements whose IDs match a requested set. I could binary-search the original array with a “comparator” on the ID property, but every comparison would require a “hop” to the memory location where the object resides to read its ID, which is rather slow.

That’s why I created a RoaringBitmap of the IDs extracted from the records in the large array, and I look IDs up in it using org.roaringbitmap.RoaringBatchIterator. The problem is that I have to “materialize” every record of the RoaringBitmap via nextBatch in order to count the elements examined so far. I’d like to use advanceIfNeeded instead, to skip entire segments that cannot contain the ID currently being looked up. However, advanceIfNeeded doesn’t return the number of elements it skipped, which I need to compute the correct index into the original “large array of fat objects”.

Describe the solution you’d like

Add a new method, or change the return type of advanceIfNeeded to int (the latter would probably break backward binary compatibility of RoaringBitmap), so that it returns the number of elements skipped while “fast-forwarding” the iterator.

Does it make sense to you?
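To make the requested semantics concrete, here is a minimal sketch in plain Java. It does not use the RoaringBitmap API: `CountingCursor`, `advanceAndCount` and `SkipCountDemo` are hypothetical names, and a sorted `int[]` stands in for the bitmap. The only point is to show how a returned skip count would let the caller keep a running index into the record array.

```java
import java.util.Arrays;

/** Hypothetical cursor over sorted, distinct ids; a stand-in for a bitmap iterator. */
final class CountingCursor {
    private final int[] sortedIds;
    private int position; // index of the next id the cursor would return

    CountingCursor(int[] sortedIds) {
        this.sortedIds = sortedIds;
    }

    /** Moves to the first id >= minValue and returns how many ids were skipped. */
    int advanceAndCount(int minValue) {
        int found = Arrays.binarySearch(sortedIds, position, sortedIds.length, minValue);
        int target = found >= 0 ? found : -found - 1; // insertion point when absent
        int skipped = target - position;
        position = target;
        return skipped;
    }

    boolean hasNext() {
        return position < sortedIds.length;
    }

    int peekNext() {
        return sortedIds[position];
    }
}

final class SkipCountDemo {
    /** Returns the indexes (into the record array) of the requested ids that exist. */
    static int[] indexesOf(int[] indexedIds, int[] requestedIds) {
        CountingCursor cursor = new CountingCursor(indexedIds);
        int[] result = new int[requestedIds.length];
        int found = 0;
        int index = 0; // running index into the "fat" record array
        for (int id : requestedIds) {
            index += cursor.advanceAndCount(id); // the skip count keeps the index in sync
            if (cursor.hasNext() && cursor.peekNext() == id) {
                result[found++] = index;
            }
        }
        return Arrays.copyOf(result, found);
    }
}
```

With indexed ids `{2, 5, 9, 14, 20}` and requested ids `{5, 14, 15, 20}`, `indexesOf` yields the positions 1, 3 and 4; the id 15 is skipped because it is not indexed. In a real bitmap the skip count is not a simple subtraction, which is what the maintainer's reply further down is about.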

Code snippet reflecting the situation

@Nonnull
default PriceRecordContract[] getPriceRecords(@Nonnull Bitmap priceIds) {
	final RoaringBatchIterator lookupIterator = RoaringBitmapBackedBitmap.getRoaringBitmap(priceIds).getBatchIterator();
	final RoaringBatchIterator superSetIterator = RoaringBitmapBackedBitmap.getRoaringBitmap(getIndexedPriceIds()).getBatchIterator();
	final PriceRecordContract[] priceRecords = getPriceRecords();

	final CompositeObjectArray<PriceRecordContract> filteredPriceRecords = new CompositeObjectArray<>(PriceRecordContract.class);
	int lastPriceId = 0;
	int superSetBatchPeek = 0;
	int lastSuperSetBatchRecord = 0;
	int relativeSuperSetIndex = -1;
	int[] lookupBatch = new int[512];
	int[] superSetBatch = new int[512];
	int superSetIteratorPeek = 0;

	while (lookupIterator.hasNext()) {
		final int lookupIteratorPeek = lookupIterator.nextBatch(lookupBatch);
		for (int i = 0; i < lookupIteratorPeek; i++) {
			int lookedUpId = lookupBatch[i];

			// fast-forward while the last record of the current super set batch is less than the looked up id
			while (lookedUpId > lastSuperSetBatchRecord && superSetIterator.hasNext()) {
				superSetBatchPeek += superSetIteratorPeek;
				superSetIteratorPeek = superSetIterator.nextBatch(superSetBatch);
				relativeSuperSetIndex = -1;
				if (superSetIteratorPeek > 0) {
					lastSuperSetBatchRecord = superSetBatch[superSetIteratorPeek - 1];
				} else {
					lastSuperSetBatchRecord = 0;
				}
			}
			
			// look for the index of price detail (searched block is getting smaller and smaller with each match)
			final int fromIndex = relativeSuperSetIndex + 1;
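			// cap the upper bound by the size of the current batch (superSetIteratorPeek), not by the cumulative offset
			final int toIndex = Math.min(fromIndex + lookedUpId - lastPriceId, superSetIteratorPeek);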

			// look for the price in currently read super set batch
			relativeSuperSetIndex = Arrays.binarySearch(superSetBatch, fromIndex, toIndex, lookedUpId);
			if (relativeSuperSetIndex < 0) {
				relativeSuperSetIndex = -1 * (relativeSuperSetIndex) - 2;
			} else {
				final PriceRecordContract priceRecord = priceRecords[superSetBatchPeek + relativeSuperSetIndex];
				filteredPriceRecords.add(priceRecord);
				lastPriceId = lookedUpId;
			}
		}
	}

	// return found prices as array
	return filteredPriceRecords.toArray();
}

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

1 reaction
lemire commented, May 27, 2022

> And therefore, when you advance over entire container, you know what has been skipped.

Yes. That part is easy. But you don’t typically skip over an entire container. So think about a bitset… it is just an array of bits… 010001101110000001110000111000001. In general, the function you request requires you to count the bits in a range of indexes. We can do that fast, but it definitely is not free.
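As an illustration of that cost, the sketch below counts the set bits in an index range of a bitset stored as 64-bit words. It is plain Java and not RoaringBitmap's internal code; `BitRangeCount`, `countRange` and `fromBitString` are hypothetical names. Partial words at both ends are masked, and every word in between needs its own `Long.bitCount` call, so the work grows with the width of the skipped range.

```java
final class BitRangeCount {
    /** Counts the set bits at positions [from, to) in a bitset stored as 64-bit words. */
    static int countRange(long[] words, int from, int to) {
        if (from >= to) {
            return 0;
        }
        int firstWord = from >>> 6;
        int lastWord = (to - 1) >>> 6;
        // Java long shifts use only the low 6 bits of the shift distance.
        long firstMask = -1L << from; // clears the bits below 'from' in the first word
        long lastMask = -1L >>> -to;  // keeps the bits below 'to' in the last word
        if (firstWord == lastWord) {
            return Long.bitCount(words[firstWord] & firstMask & lastMask);
        }
        int count = Long.bitCount(words[firstWord] & firstMask);
        for (int i = firstWord + 1; i < lastWord; i++) {
            count += Long.bitCount(words[i]); // one population count per full word
        }
        count += Long.bitCount(words[lastWord] & lastMask);
        return count;
    }

    /** Builds a word array from a string such as "0100011", where character i is bit i. */
    static long[] fromBitString(String bits) {
        long[] words = new long[(bits.length() + 63) >>> 6];
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                words[i >>> 6] |= 1L << i;
            }
        }
        return words;
    }
}
```

For example, `countRange(fromBitString("010001101110000001110000111000001"), 5, 20)` counts the set bits among positions 5 to 19 of the bit string quoted above.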

0 reactions
lemire commented, May 27, 2022

> IIRC that iterator yields at container boundaries, but returns how many elements were filled.

Ah. Interesting.
