[SIP-13] Proposal for Code Review Process
See original GitHub issueMotivation
The Superset community has a large number of contributors and stakeholders, which leads to many people trying to modify the codebase at the same time.
For project maintainer this introduces a lot of challenges:
- A large number of issues and pull requests, which require significant amount of time to triage, review and shepherd until merging. As of Oct 15, 2018, there are 864 open issues and 161 open PRs. Many of these are a few years old and remain in limbo.
- Keeping the code base healthy. It is great to receive many community contributions, but the overall project stability and code quality must come first. If the PR idea/work is interesting, but the quality does not match the standard or introduces risky changes without justified reasons, the project maintainers have a hard time handling these PRs. With the proposal for frequent release plan (SIP-12), master stability becomes even more important.
- Merging a PR that ends up breaking master branch. Old community PRs that are finally merged sometimes break the master branch (commonly CI) because they are not rebased against master and tested again before merging. The responsibility then falls to the maintainer to resolve the broken state.
For contributors, these are the common questions:
- Who will review my PR? Currently, it is unclear who should participate in the code review of each PR. Contributors sometimes manually tag github username of project maintainers for reviews. This does not work well for beginners who do not know anyone, perhaps tagging @mistercrunch. Even experienced contributors can have this issue when touching the code in different sections of the project. It is also a manual process.
- When will someone review my PR? There are several uncertainties around the code review process. Some are implied (e.g.,no review until the PR pass all CI tests, PRs with [WIP] in the title are considered work-in-progress and will not be reviewed, etc.). However, there are many PRs that are not ready for review, but do not have [WIP] and cause confusion.
- When will someone merge my PR? After PRs are approved, sometimes they are left in that state for a while. It is not explained when it will be merged. This left the contributors hanging.
- Which release will include my PR? Currently, it is unclear when will the change make it to an official release, leading to different organizations maintaining their own branches. Please also see a proposal for release plan in SIP-12 for more details.
To be more efficient at maintaining Superset and improving it in the long-term, we need more workflow clarity that will increase code quality and improve project stability. This proposal wishes to reduce operational workload for the project maintainers, set the right expectations for everyone and make the project keep growing with happy community.
Proposed Changes
[P.1] Project maintainers can request an issue to be created before a PR is reviewed.
A philosophy we would like to strong encourage is
“Before creating a PR, create an issue.”
We would like to update the contributing guidelines for Superset as follows.
Bug fixes: If you’re only fixing a small bug, it’s fine to submit a pull request right away but we highly recommend to file an issue detailing what you’re fixing. This is helpful in case we don’t accept that specific fix but want to keep track of the issue. Please keep in mind that the project maintainers reserve the rights to accept or reject incoming PRs, so it is better to separate the issue and the code to fix it from each other. In some cases, project maintainers may request you to create a separate issue from PR before proceeding.
Refactor: For small refactors, it can be a standalone PR itself detailing what you are refactoring and why. If there are concerns, project maintainers may request you to create a SIP for the PR before proceeding.
Feature/Large changes: If you intend to change the public API, or make any non-trivial changes to the implementation, we requires you to file a new issue as SIP (Superset Improvement Proposal). This lets us reach an agreement on your proposal before you put significant effort into it. You are welcome to submit a PR along with the SIP (sometimes necessary for demonstration), but we will not review/merge the code until the SIP is approved.
In general, small PRs are always easier to review than large PRs. The best practice is to break your work into smaller independent PRs and refer to the same issue. This will greatly reduce turnaround time.
Finally, never submit a PR that will put master branch in broken state.
[P.2] Create multiple new templates for new issues/PRs
Currently we only have an issue template but not a PR template.
To facilitate the issue and PR creation, we will provide templates for different scenarios. Github supports multiple templates which users can choose from a menu when creating.
- Creating issue templates
- Multiple issue and pull request templates
- Ansible has good examples of templates
An issue will be one of the following types:
- Question
- Bug
- Feature request
- SIP (Superset Improvement Proposal)
A PR will be one of the following types
- Bugfix - Fix bugs
- Feature - Add new features
- Refactor - Refactor code to improve code quality
- Test - Add new test cases
A PR should generally includes the following:
- Summary
- Before/after screenshots or animated GIF (when applicable)
- Test Plan
- Checkboxes
- Has associated issue(s):
- Changes UI
- Includes screenshot
- Requires DB Migration
- Introduces new feature or API
- Removes existing feature or API
[P.3] Pre-define reviewers with CODEOWNERS
Github allows us to list mandatory reviewers for each directory in a CODEOWNERS file. When a new PR is opened, the corresponding reviewers will be assigned to review that PR automatically. Mandatory reviewers do not have to be committers.
CODEOWNERS will be defined as a team. A team contains one or more people, such as (python-reviewers
, js-reviewers
, vis-reviewers
) See an example.
Note: CODEOWNERS requires changing github repo settings (via Apache) to enable protected branch option. Teams are defined in organization settings. Need to investigate if it has to be the same github organization (Apache) to define the teams. Fallback plan is using another github org or macro/alias.
[P.4] Define a protocol when a PR is in progress
Authoring
- Fill in all sections of the PR template.
- Add label “WIP” if not ready for review (WIP = work-in-progress)
- Add category label: bugfix, refactor, test or feature
- Remove label “WIP” and add “READY” when ready for review
- Changes to user interface require before/after screenshots, or GIF for interactions
- Recommended capture tools (Kapture, LICEcap, Skitch)
- If no screenshot is provided, the mandatory reviewers will mark the PR with “missing:screenshot’ label and will not review until screenshot is provided.
- Reviewers will not review the code until all CI tests are passed. Sometimes there can be flaky tests. You can close and open PR to re-run CI test. Please report if the issue persists and rebase your PR after the fix has been deployed to master.
- If the PR was not ready for review and inactive for > 30 days, we will close it due to inactivity. The author is welcome to re-open and update. Will set up a stale bot to do this.
Reviewing
- Triage and add labels: Here are some example labels. Please keep in mind that these are examples. The final list of labels and adding a bot to add them will become action items if this SIP is approved.
- Missing: screenshot, issue, test
- Language: Javascript, Python, Others
- Components: Dashboard, SQL Lab, Explore, Charts
- Use constructive tone when writing reviews.
- If there are changes required, state clearly what needs to be done before the PR can be approved.
Merging
- At least one mandatory reviewer is required for merging a PR.
- A PR with label “READY” and approved will be merged by a committer. The committer adds label(s) of release branches to go into, i.e.,
release/1.1
,release/1.2
. (See SIP-12) - After the PR is merged, close the corresponding issue(s).
Post-merge Responsibility
- Project maintainers may contact the PR author if new issues are introduced by the PR.
- Project maintainers may revert your changes if a critical issue is found, such as breaking master branch CI.
Note about labels
Non-committers cannot assign label by default. We will setup bots that allow contributors to add whitelisted labels via bot. User can type in comments such as
-label WIP
+label READY
If this does not work, we can fallback to adding [WIP]
or [READY]
to PR titles.
[P.5] New changes to CI
Here are a few proposed improvements to the CI
[P.5.1] Add mandatory reviewers hook
The mandatory reviewers are defined in CODEOWNERS file
[P.5.2] Enable Stale branch check.
Pros
- Prevents broken master when a stale branch is approved and merged.
- Parallelized merging of PRs.
Cons
- Serialized merging of PRs impies adding overhead (having to re-sync either via the UI or CLI) and rerun the CI.
- Improvements in CI performance would help reduce some of the burden.
Note: Require this change to the github repo settings
[P.5.3] Automatic checks for code coverage project/patch
Similar to scikit-learn, this enforces the contributing guidelines of unit testing. Commits pushed to master should not make the overall project coverage decrease by more than 1%:
This will be enabled, though initially as a non-required check as we iterate on the configuration settings.
Note: These CI changes require Apache admin changes to the Github repo.
Action items
Please comment and refer to [P.x.x] if you have any suggestion.
If this SIP is approved, here are the follow-up tasks.
- Update CONTRIBUTING.md
- Create new issues and PRs templates
- Create CODEOWNERS file, teams, and list mandatory reviewers
- Update CI configuration
- Setup bots for handling labels
- Setup a bot for closing stale issues and PR
- Triage the current PRs and issues.
References
Issue Analytics
- State:
- Created 5 years ago
- Reactions:2
- Comments:9 (8 by maintainers)
Top GitHub Comments
Reminder that part of the SIP process should be to notify the Apache Superset dev@ mailing list.
Although this seems to be stuck in an open vote, some aspects of the SIP have been completed whereas others likely outdated, hence we sense closing this SIP makes the most sense. @kristw if you feel otherwise please re-open the SIP with additional context.