Restructuring the feedstock/team update for better rate limit usage
Our current strategy with team updates is running into rate-limit issues ( https://github.com/conda-forge/conda-forge.github.io/issues/125 ) ( https://github.com/conda-forge/conda-forge.github.io/issues/88 ). That much is clear. Now, how might we fix it? Here are some thoughts.
1. Manage staged-recipes with a different bot/token. Done with PR ( https://github.com/conda-forge/staged-recipes/pull/927 ).
   1. Easy to do.
   2. Doesn't block more merges.
   3. Saves reviewer time (otherwise spent handling the fallout of rate limits).
   4. Small savings, so isn't sufficient alone.
   5. Could handle team additions and user additions. Done with PR ( https://github.com/conda-forge/staged-recipes/pull/733 ).
   6. Obviates the need for triggering the feedstock/team update script run. Also done with PR ( https://github.com/conda-forge/staged-recipes/pull/733 ).
2. More bot accounts/tokens. Skipping per discussion with @pelson.
   1. Allows some scaling by number of bots.
   2. Effectively bumps our rate limit.
   3. Second bot is most expensive; the rest are cheap.
   4. Leaves us vulnerable to the same problems in the future.
3. Leverage the feedstock repo. (related to @pelson's suggestion)
   1. Already updating the feedstock repo.
   2. Only feedstocks that were added/changed get picked up. (a couple orders of magnitude improvement)
   3. Additional optimizations from diffing previous/current feedstock commits for name additions.
   4. Should actually catch everyone.
   5. Can always resort to the check-and-add-everyone script.
4. Move to a time-based, non-CI (Heroku) approach as opposed to on demand. (self-throttling 😄) Done. See this ( https://github.com/conda-forge/conda-forge.github.io/issues/48#issuecomment-226571553 ).
   1. Saves time spent cloning everything. (cache)
   2. Reasonable for existing feedstocks.
   3. Can still be triggered as needed.
   4. Avoids the script running when merging things here.
   5. Solves race condition problems. (debouncing requests)
   6. Kind of annoying for brand new feedstocks. (maintainers can't fix problems right away)
5. Alternatives to a full Travis CI sync.
   1. This sync is probably costing us dearly with our GitHub API token.
   2. It probably needs to check the settings of each repo and update them.
   3. This means it is O(N) in the repos present here, which is bad when our rate limit is fixed.
   4. Need to be able to add a single repo to Travis CI. See issue ( https://github.com/travis-ci/travis-ci/issues/6320 ).
6. Move the feedstock/team update script off the webpage repo. ( https://github.com/conda-forge/conda-forge.github.io/issues/214 )
   1. No spurious runs when merging unrelated changes (e.g. minutes, pinnings, docs, etc.).
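To make the O(N) concern above concrete, here is a rough back-of-the-envelope sketch. The only real number in it is GitHub's documented 5,000 requests/hour limit for authenticated tokens; the per-feedstock call count and repo count are hypothetical placeholders for illustration:

```python
# Rough budget estimate for a full O(N) sync against the GitHub API.
# AUTHENTICATED_LIMIT is GitHub's documented per-token rate limit; the
# example numbers below are hypothetical placeholders.
AUTHENTICATED_LIMIT = 5000  # requests per hour per token


def syncs_per_hour(num_feedstocks, calls_per_feedstock):
    """How many full O(N) syncs fit in one hour's rate limit."""
    calls_per_sync = num_feedstocks * calls_per_feedstock
    return AUTHENTICATED_LIMIT // calls_per_sync


# e.g. 1000 feedstocks at 3 API calls each leaves room for only one
# full sync per hour, starving every other consumer of the same token.
```

The takeaway is just that any per-repo cost multiplied by all feedstocks eats most of the budget, which is why options 3 and 5 focus on touching only changed repos.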
Basically we should do 1 to simplify the life of reviewers at staged-recipes. Whether or not it amounts to much savings for the rate limit is irrelevant, as it saves people's time. Examples of time wasted there include reporting rate-limiting issues, communicating them to each other, and working through the backlog at staged-recipes. Also, by doing this we would break the problem into two pieces: how do we handle adding maintainers to a new feedstock vs. how do we add people to existing feedstocks? This would also put us on good footing for addressing 4.6.
Doing 2 is pretty easy, though I'm not sure we should do it, as it is a bit of a crutch.
Doing 3 has clear benefits, as it is doubtful that many feedstocks are changing at any given time. With diffing, the number that actually add maintainers should be quite small. We should definitely do this, but it will be a bit of work.
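The diffing idea above boils down to a set difference between a feedstock's maintainer list before and after a commit. A minimal sketch of that core step (the function name is made up, and the idea that the two lists come from parsing the recipe at two commits is an assumption, not the actual script):

```python
def added_maintainers(old, new):
    """Return the maintainers present in `new` but not in `old`.

    In practice `old` and `new` would come from parsing a feedstock's
    recipe metadata at the previous and current commits (e.g. via
    `git show <rev>:<path>`); here they are plain lists of GitHub
    handles, to keep the sketch self-contained.
    """
    return sorted(set(new) - set(old))


# Only the returned users need API calls (team/collaborator additions);
# feedstocks whose maintainer list is unchanged cost zero requests.
```

Combined with only looking at feedstocks touched since the last run, this is what gets the per-sync API cost from O(all feedstocks) down to O(actual additions).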
Doing 4 is generally a net plus, though we should figure out how to address 4.6. For instance, 1.5 would help with this. In general, this couples nicely with 1 and addresses the underlying reason for triggering the feedstock update script on merge at staged-recipes.
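The debouncing benefit of 4 can be sketched as a timer that collapses bursts of triggers into at most one run per interval. A toy illustration only (not the actual Heroku worker; the class name, interval, and injectable clock are made up for the sketch):

```python
class Debouncer:
    """Collapse bursts of triggers into at most one run per interval."""

    def __init__(self, interval, clock):
        self.interval = interval  # minimum seconds between allowed runs
        self.clock = clock        # injectable time source (e.g. time.time)
        self.last_run = None

    def trigger(self):
        """Return True if a run should happen now, False if suppressed."""
        now = self.clock()
        if self.last_run is None or now - self.last_run >= self.interval:
            self.last_run = now
            return True
        return False
```

Ten merges landing within a minute then cost one update run instead of ten, which is exactly the race-condition/self-throttling point above.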
Thoughts? Feedback? Other suggestions?
Issue Analytics
- Created 7 years ago
- Comments: 12 (11 by maintainers)
Top GitHub Comments
Going to close this for now. There is more work that can be done, but things seem to be more reliable now and less pressing.
Would it make sense to sprinkle in more calls to `print_rate_limiting_info(gh)` in the `create_feedstock.py` script so that we can have a more fine-grained view of where we are using the GitHub API? Perhaps before each `subprocess.check_call`?