Improve OSSIndex Analyzer batching and resilience
Current Behavior:
OssIndexAnalysisTask paginates its requests, sending up to 100 purls per call. However, there is room for improvement:
- The maximum number of purls allowed per call is now 128 instead of 100.
- Components whose component analysis cache entry is still valid are not pruned before pagination. This can result in inefficient batching, as in the example below, where 16 requests were sent to OSSIndex when 7 would have sufficed (or 5 with 128 components per page):
2022-10-08 19:20:12,563 INFO [OssIndexAnalysisTask] Analyzing 47 component(s)
2022-10-08 19:20:17,576 INFO [OssIndexAnalysisTask] Analyzing 91 component(s)
2022-10-08 19:20:19,946 INFO [OssIndexAnalysisTask] Analyzing 74 component(s)
2022-10-08 19:20:27,405 INFO [OssIndexAnalysisTask] Analyzing 71 component(s)
2022-10-08 19:20:30,803 INFO [OssIndexAnalysisTask] Analyzing 48 component(s)
2022-10-08 19:20:34,584 INFO [OssIndexAnalysisTask] Analyzing 86 component(s)
2022-10-08 19:20:36,513 INFO [OssIndexAnalysisTask] Analyzing 13 component(s)
2022-10-08 19:20:38,307 INFO [OssIndexAnalysisTask] Analyzing 2 component(s)
2022-10-08 19:20:41,403 INFO [OssIndexAnalysisTask] Analyzing 2 component(s)
2022-10-08 19:20:45,016 INFO [OssIndexAnalysisTask] Analyzing 2 component(s)
2022-10-08 19:20:46,315 INFO [OssIndexAnalysisTask] Analyzing 1 component(s)
2022-10-08 19:20:57,674 INFO [OssIndexAnalysisTask] Analyzing 41 component(s)
2022-10-08 19:20:59,069 INFO [OssIndexAnalysisTask] Analyzing 46 component(s)
2022-10-08 19:21:04,412 INFO [OssIndexAnalysisTask] Analyzing 42 component(s)
2022-10-08 19:21:06,030 INFO [OssIndexAnalysisTask] Analyzing 41 component(s)
2022-10-08 19:21:07,461 INFO [OssIndexAnalysisTask] Analyzing 31 component(s)
- There is no retry (with exponential backoff) or throttling mechanism implemented. In case of any HTTP error (such as 429 when the rate limit is exceeded), only a log entry and a notification are issued.
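To make the batching point concrete: the log above totals 638 components across 16 requests. A minimal sketch of pruning before paginating is shown below. It is not the actual Dependency-Track implementation; the class name, the `pruneThenPaginate` method and the `isCacheValid` predicate are hypothetical stand-ins for the real cache lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: names below are illustrative, not Dependency-Track APIs.
class OssIndexBatchingSketch {

    // Per-request purl limit, taken from the issue text (assumption: 128).
    static final int MAX_PURLS_PER_REQUEST = 128;

    /**
     * Drops purls whose cached analysis is still valid, then splits the
     * remaining purls into pages of at most pageSize entries.
     */
    static List<List<String>> pruneThenPaginate(List<String> purls,
                                                Predicate<String> isCacheValid,
                                                int pageSize) {
        // Step 1: prune first, so pages are filled only with purls that need analysis.
        List<String> pending = new ArrayList<>();
        for (String purl : purls) {
            if (!isCacheValid.test(purl)) {
                pending.add(purl);
            }
        }
        // Step 2: paginate the pruned list; every page except the last is full.
        List<List<String>> pages = new ArrayList<>();
        for (int i = 0; i < pending.size(); i += pageSize) {
            int end = Math.min(i + pageSize, pending.size());
            pages.add(new ArrayList<>(pending.subList(i, end)));
        }
        return pages;
    }

    public static void main(String[] args) {
        // 1000 made-up purls, of which the first 638 are treated as not cached.
        List<String> purls = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            purls.add("pkg:maven/com.example/lib-" + i + "@1.0.0");
        }
        Predicate<String> cached = p -> purls.indexOf(p) >= 638;
        System.out.println(pruneThenPaginate(purls, cached, 100).size());                   // 7
        System.out.println(pruneThenPaginate(purls, cached, MAX_PURLS_PER_REQUEST).size()); // 5
    }
}
```

With 638 components left after pruning, this yields 7 pages at 100 purls per page and 5 pages at 128, which matches the figures quoted above.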
Proposed Behavior:
Implement solutions for each of the three points mentioned above.
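For the retry point, one possible shape is sketched below, assuming that HTTP 429 and 5xx responses are the ones worth retrying. All names (`RetrySketch`, `sendWithRetry`, `backoffDelayMs`) and the delay values are hypothetical. The request and the sleep function are passed in, so the sketch does not depend on any particular HTTP client.

```java
import java.util.function.IntSupplier;
import java.util.function.LongConsumer;

// Hypothetical sketch of retry with exponential backoff; not the project's actual code.
class RetrySketch {

    /** Assumption: only rate limiting (429) and server errors (5xx) are retried. */
    static boolean isRetryable(int status) {
        return status == 429 || status >= 500;
    }

    /** Delay before retry number `attempt` (0-based): initial * 2^attempt, capped at maxDelayMs. */
    static long backoffDelayMs(int attempt, long initialDelayMs, long maxDelayMs) {
        // Cap the shift so the multiplication cannot overflow for large attempt counts.
        long delay = initialDelayMs << Math.min(attempt, 20);
        return Math.min(delay, maxDelayMs);
    }

    /**
     * Sends the request up to maxAttempts times and returns the last status code.
     * `request` performs one HTTP call and returns its status; `sleeper` waits
     * for the given number of milliseconds (injected so it can be stubbed in tests).
     */
    static int sendWithRetry(IntSupplier request, int maxAttempts,
                             long initialDelayMs, long maxDelayMs,
                             LongConsumer sleeper) {
        int status = request.getAsInt();
        for (int attempt = 0; attempt < maxAttempts - 1 && isRetryable(status); attempt++) {
            sleeper.accept(backoffDelayMs(attempt, initialDelayMs, maxDelayMs));
            status = request.getAsInt();
        }
        return status;
    }
}
```

In real use, `sleeper` could wrap `Thread.sleep` and `request` could wrap the call to OSS Index. Adding random jitter to the delay, or honoring a `Retry-After` response header when one is present, would be natural extensions; neither is shown here.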
Issue Analytics
- State:
- Created a year ago
- Reactions:3
- Comments:12 (11 by maintainers)
Top GitHub Comments
I am wondering if this actually happens a lot? Wouldn’t you need quite a lot of things to align for “lots of duplicate calls” to happen?
You would need:
Sounds like this would only happen occasionally and might not be worth adding synchronization for? Synchronization might make things faster in the happy flow, but once you get errors/timeouts/hiccups it can be painful to debug/solve.
Definitely something we can improve, thanks for reporting!