Improve Performance of Fetching Descriptor file via TRS
Is your feature request related to a problem? Please describe. Fetching a single descriptor file can be a little slow, at least locally.
Fetching one file from one particular workflow version results in 241 JDBC statements. This workflow has 572 versions:
curl -s -X GET "http://localhost:8080/api/ga4gh/v1/tools/%23workflow%2Fgithub.com%2Fbroadinstitute%2Fgatk%2Fcnv_somatic_pair_workflow/versions" -H "accept: application/json" | jq '. | length'
572
Describe the solution you’d like
The current code loads up an entry and fetches all of its versions. It then looks in Java for the matching version to find the correct source file. That means it’s fetching all the versions (572 in my example above) only to discard all but one.
Instead, the code could just fetch the version or maybe even the source files directly from the DB, although it’s slightly more complicated than that; see surrounding code.
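To illustrate the difference between the two approaches, here is a minimal, self-contained sketch. It is not Dockstore code: `FakeStore`, `Version`, `fetchByScanning`, `fetchDirectly` and `buildStore` are hypothetical names, and an in-memory map stands in for the database. The counter only tracks how many versions each lookup materializes; it does not model the 241 JDBC statements or the extra complications mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class DescriptorLookupSketch {

    /** Hypothetical stand-in for one workflow version and its source files (path -> content). */
    public static final class Version {
        public final String name;
        public final Map<String, String> filesByPath;

        public Version(String name, Map<String, String> filesByPath) {
            this.name = name;
            this.filesByPath = filesByPath;
        }
    }

    /** Hypothetical in-memory store that counts how many versions each lookup materializes. */
    public static final class FakeStore {
        private final Map<String, Version> versionsByName = new LinkedHashMap<>();
        public int versionsLoaded = 0;

        public void add(Version version) {
            versionsByName.put(version.name, version);
        }

        // Mirrors the behavior described in the issue: load every version of the entry.
        public List<Version> loadAllVersions() {
            versionsLoaded += versionsByName.size();
            return new ArrayList<>(versionsByName.values());
        }

        // Mirrors the proposal: load only the version that was asked for.
        public Optional<Version> loadVersion(String name) {
            Version version = versionsByName.get(name);
            if (version != null) {
                versionsLoaded++;
            }
            return Optional.ofNullable(version);
        }
    }

    // Load all versions, then pick the matching one in Java.
    public static Optional<String> fetchByScanning(FakeStore store, String versionName, String path) {
        return store.loadAllVersions().stream()
                .filter(v -> v.name.equals(versionName))
                .findFirst()
                .map(v -> v.filesByPath.get(path));
    }

    // Ask the store for the single version and read the file from it.
    public static Optional<String> fetchDirectly(FakeStore store, String versionName, String path) {
        return store.loadVersion(versionName)
                .map(v -> v.filesByPath.get(path));
    }

    // Builds a store with versions named v0 .. v(versionCount - 1), each holding one file.
    public static FakeStore buildStore(int versionCount) {
        FakeStore store = new FakeStore();
        for (int i = 0; i < versionCount; i++) {
            store.add(new Version("v" + i, Map.of("main.wdl", "workflow body " + i)));
        }
        return store;
    }

    public static void main(String[] args) {
        FakeStore store = buildStore(572);
        fetchByScanning(store, "v10", "main.wdl");
        System.out.println("scan loaded " + store.versionsLoaded + " versions");

        store.versionsLoaded = 0;
        fetchDirectly(store, "v10", "main.wdl");
        System.out.println("direct lookup loaded " + store.versionsLoaded + " versions");
    }
}
```

Both methods return the same file content; the only difference is how many versions are pulled in to find it. In the real code the equivalent of `loadVersion` would presumably be a query filtered by entry and version name, and its exact shape is left open here.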
Describe alternatives you’ve considered Leave it. Seems to perform much better in a deployed env than in my local setup.
Additional context This is one of our most heavily hit endpoints when Terra is executing a workflow across many containers. On workflows without many versions it’s very fast; for the one in my example it takes a couple of seconds, whereas it’s under 100ms for simple descriptors.
┆Issue is synchronized with this Jira Story ┆fixVersions: Dockstore 1.13 ┆friendlyId: DOCK-1921 ┆sprint: SEAB 83- Luca ┆taskType: Story
Top GitHub Comments
➤ Steve Von Worley commented:
Denis instructed that we should timebox this ticket to two hours for 1.12. I inspected the code and concluded that it would likely take me longer to complete the ticket: understand the code, modify it, test it, create/manage a PR, etc. So, I have changed the fix version to 1.13 and unassigned.
bad unito bot