Performance regression on LatLonPoint#newPolygonQuery

Description

I just noticed a big performance regression on polygon queries using LatLonPoint fields in the Lucene geo benchmarks.

I checked and the regression was introduced by this change: https://github.com/apache/lucene/pull/1017.

My suspicion is that before this change, SpatialQuery called #getSpatialVisitor() once for the whole index, but the new version calls it once per segment. This method might be expensive for LatLonPoint queries, hence the regression.
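To illustrate the suspected mechanism, here is a minimal sketch that counts how often an expensive setup step runs when it is built once per query versus once per segment. All names here (`VisitorCostSketch`, `buildVisitor`, the integer "segments") are hypothetical stand-ins and not Lucene classes; the real `getSpatialVisitor()` does different work.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical stand-ins for illustration only; none of this is Lucene API.
class VisitorCostSketch {
    // Counts how often the expensive setup step runs.
    static final AtomicInteger BUILDS = new AtomicInteger();

    // Stand-in for an expensive getSpatialVisitor()-like call,
    // e.g. pre-processing a polygon before it can be tested against points.
    static IntPredicate buildVisitor(int threshold) {
        BUILDS.incrementAndGet();
        return value -> value >= threshold;
    }

    // Build the visitor once and reuse it for every segment.
    static int countOncePerQuery(List<int[]> segments, int threshold) {
        IntPredicate visitor = buildVisitor(threshold);
        int hits = 0;
        for (int[] segment : segments) {
            for (int v : segment) {
                if (visitor.test(v)) hits++;
            }
        }
        return hits;
    }

    // Build the visitor inside the per-segment path.
    static int countOncePerSegment(List<int[]> segments, int threshold) {
        int hits = 0;
        for (int[] segment : segments) {
            IntPredicate visitor = buildVisitor(threshold); // repeated setup cost
            for (int v : segment) {
                if (visitor.test(v)) hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<int[]> segments =
            List.of(new int[] {1, 5, 9}, new int[] {2, 7}, new int[] {8});

        BUILDS.set(0);
        int hoistedHits = countOncePerQuery(segments, 5);
        int hoistedBuilds = BUILDS.get();

        BUILDS.set(0);
        int perSegmentHits = countOncePerSegment(segments, 5);
        int perSegmentBuilds = BUILDS.get();

        System.out.println(hoistedHits + " hits, " + hoistedBuilds + " build(s) when hoisted");
        System.out.println(perSegmentHits + " hits, " + perSegmentBuilds + " build(s) per segment");
    }
}
```

Both variants return the same hits, but the second one pays the setup cost once for each of the three segments instead of once overall. On an index with many segments, that multiplier is what would make an expensive setup step show up in the benchmarks.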

@nknize FYI

Version and environment details

No response

Issue Analytics

Created a year ago
Comments: 5 (5 by maintainers)

Top GitHub Comments

4 reactions

iverase commented, Sep 28, 2022

The fix seems to bring performance back to previous levels.

1 reaction

nknize commented, Sep 28, 2022

What’s annoying is how incredibly trappy this override logic is. That a method call literally moving from createWeight to getScorerSupplier results in a 72.2% regression even slipped by me before merge doesn’t bode well for new committers. And then it sat in regression until an entire company was interested in releasing.

I wonder if we can do better? Like maybe figure out better guardrails in these methods? Perhaps by something as simple as a rename (e.g., getScorerSupplierPerSegment) to signal one happens per segment? This isn’t the first and will certainly not be the last time an expensive operation accidentally slips to a critical path. Any other ideas how to lower the bar here for new committers?
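One possible guardrail, besides a rename, is to hand the expensive value to per-segment code through a memoizing wrapper, so that an accidental call on the hot path only pays the cost once. The sketch below is an assumption about what that could look like, not something proposed in the issue; `Memoized` and `expensiveSetup` are hypothetical names and not part of Lucene.

```java
import java.util.function.Supplier;

// Hypothetical guardrail sketch; not Lucene API.
class MemoizedGuardSketch {

    // Wraps a supplier so the delegate runs at most once, however often get() is called.
    static final class Memoized<T> implements Supplier<T> {
        private final Supplier<T> delegate;
        private T value;
        private boolean computed;

        Memoized(Supplier<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized T get() {
            if (!computed) {
                value = delegate.get();
                computed = true;
            }
            return value;
        }
    }

    // Counts invocations of the expensive step for the demo.
    static int setupCalls = 0;

    // Stand-in for an expensive per-query setup step.
    static String expensiveSetup() {
        setupCalls++;
        return "visitor";
    }

    public static void main(String[] args) {
        Supplier<String> visitor = new Memoized<>(MemoizedGuardSketch::expensiveSetup);

        // Simulate per-segment code asking for the visitor on every segment.
        for (int segment = 0; segment < 5; segment++) {
            visitor.get();
        }
        System.out.println("expensiveSetup ran " + setupCalls + " time(s)");
    }
}
```

The trade-off is that memoization hides the cost rather than signalling it, so it complements a clearer name instead of replacing it.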