Better support SCM functionality
From #318 the basic use cases are as follows:
Proposal
SCM view
- View all directories and files that have changed when compared to HEAD or cache (Visibility / Situational awareness)
- Checkout / commit any files or directories that have changed (Actions) - exact UX TBD.
- Checkout / commit / push or pull the entire repository (Actions)
Statuses that we currently provide in the extension
Status | SCM View | Decorations Provided ** | Sourced from | Notes |
---|---|---|---|---|
added | Y | Y | diff + list | |
deleted | Y | Y | diff + list | |
modified | Y | Y | diff + list + status | |
notInCache | Y | Y | diff + list | |
renamed | Y | Y | diff + list | |
stageModified | Y | Y | diff + list + status | For a detailed explanation of modified vs stageModified see https://github.com/iterative/vscode-dvc/issues/318#issuecomment-845800460 |
untracked | Y | Y | git | This is untracked with respect to both git and dvc. We show these files because the user may want to dvc add them. |
tracked | Y | Y | list | We decorate tracked files because they are generally “git ignored”, which will give them a “greyed out” decoration. |
** Where possible we match the git extension’s decorations because we are trying to make the extension feel as native as possible. Our SCM integration is designed to show the user the state of the workspace with respect to the most recent commit.
Current approach (parallel CLI Commands)
name | command | reason |
---|---|---|
list | dvc list . --dvc-only -R --show-json | Provides a list of all tracked files that we use for both decoration and SCM purposes. In the SCM view all files that we show must be tracked by DVC. We do this because, if we do not, we end up with untracked but modified (duplicate) items in the tree from diff. |
diff | dvc diff --show-json | We map the output of diff directly to the list output to set all added, deleted, renamed and notInCache statuses. We use it in combination with the status output to determine the difference between modified and “stage modified”. |
status | dvc status --show-json | Only used to determine the difference between modified and “stage modified”. |
We currently try to run all three of the above commands in parallel. If any of the commands fail then we will retry all three until they have all completed without error. We do this to best mitigate stale information ending up in the extension.
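The grouped retry described above can be sketched roughly as follows. This is an illustration in Python rather than the extension's actual code: `run_json`, `fetch_all`, `max_attempts` and `delay` are hypothetical names, and the attempt cap is an assumption added so that the sketch always terminates (the description above retries until everything succeeds).

```python
import asyncio
import json

# The three commands from the table above, keyed by the names used there.
COMMANDS = {
    "list": ["dvc", "list", ".", "--dvc-only", "-R", "--show-json"],
    "diff": ["dvc", "diff", "--show-json"],
    "status": ["dvc", "status", "--show-json"],
}


async def run_json(args):
    """Run one CLI command and parse its stdout as JSON; raise on a non-zero exit."""
    proc = await asyncio.create_subprocess_exec(
        *args, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE
    )
    stdout, stderr = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(stderr.decode())
    return json.loads(stdout)


async def fetch_all(run=run_json, max_attempts=5, delay=0.5):
    """Run all commands concurrently; if any fails, discard every result and retry the group."""
    for _ in range(max_attempts):
        results = await asyncio.gather(
            *(run(args) for args in COMMANDS.values()), return_exceptions=True
        )
        if not any(isinstance(r, BaseException) for r in results):
            # All three outputs come from the same attempt, so they describe
            # (approximately) the same state of the workspace.
            return dict(zip(COMMANDS, results))
        await asyncio.sleep(delay)
    raise RuntimeError("commands did not all succeed within the attempt limit")
```

Discarding the successful results along with the failed one is what keeps the three outputs consistent with each other, at the cost of repeating work on every failure.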
Issues with the current approach
- It’s still slow (#608)
- We are unsure of what the actual UI / UX should be (#609)
- A lot of data is sent between the CLI and extension that is unused (example: in get-started-experiments, after first running an experiment, the output of diff contains ~80k “added” files; none of these files are tracked by DVC, so we filter all of the records out)
- We have issues running multiple commands in parallel (https://github.com/iterative/vscode-dvc/issues/767#issuecomment-910862443). This is particularly important because it means we cannot currently run the extension against get-started-experiments
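To make the unused-data point concrete, here is a minimal sketch of the filtering step, written in Python for brevity. The JSON shapes are assumptions for illustration: list entries are taken to be objects with a `path` key, and diff output is taken to be a mapping from status name to entries with a `path` key, where renamed entries carry `{"old": ..., "new": ...}`. `tracked_paths` and `filter_diff` are hypothetical helper names.

```python
def tracked_paths(list_output):
    """Collect the set of paths that the list output reports as tracked."""
    return {entry["path"] for entry in list_output}


def filter_diff(diff_output, tracked):
    """Keep only diff entries whose path is in the tracked set."""
    filtered = {}
    for status, entries in diff_output.items():
        kept = []
        for entry in entries:
            path = entry["path"]
            # Assumption: renamed entries hold a dict of old and new paths;
            # the new path is the one that would appear in the tree.
            if isinstance(path, dict):
                path = path["new"]
            if path in tracked:
                kept.append(entry)
        filtered[status] = kept
    return filtered
```

In the get-started-experiments case described above, every one of the ~80k “added” entries would be transferred and parsed only to be dropped by a check like this one.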
Options for mitigation
# | option | pros | cons |
---|---|---|---|
1 | Run commands sequentially | locks should no longer be an issue | even slower |
2 | Only rerun failed commands | also mitigates lock issue | involves more complicated logic, possibility of stale data |
3 | Make all 3 commands lockless | allows us to continue to run all commands in parallel | involves work from the CLI team and is only an interim solution, complicates the internals of DVC |
4 | Combine commands into single command that the integration can run | limits the amount of data needing to be transferred between the cli and extension, should be faster, cuts out grouped retry logic | more effort required, unsure as to benefit to general users |
5 | Replace CLI calls with event driven architecture | eliminates the need to call the CLI, could serve multiple clients | requires even more work and is not a short or even medium term solution |
6 | Make commands “lightweight” (add --dvc-only) | would limit the amount of data being passed and could speed things up | unsure as to the benefit to general users, still requires effort, could still run into lock issues |
My preference would be to start work on 4 as it would actually help us move towards 5.
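For comparison, option 2 (only rerun failed commands) might look something like the sketch below. It is written in Python as an illustration only; `fetch_with_partial_retry`, `run`, `max_attempts` and `delay` are hypothetical names, and `run` is any coroutine function that executes one command and raises on failure.

```python
import asyncio


async def fetch_with_partial_retry(commands, run, max_attempts=5, delay=0.5):
    """Run all commands concurrently, then rerun only the ones that failed."""
    results = {}
    pending = dict(commands)  # name -> argument list, still to succeed
    for _ in range(max_attempts):
        names = list(pending)
        outcomes = await asyncio.gather(
            *(run(pending[name]) for name in names), return_exceptions=True
        )
        for name, outcome in zip(names, outcomes):
            if not isinstance(outcome, BaseException):
                # Keep the successful result; it is not refreshed on later attempts.
                results[name] = outcome
                del pending[name]
        if not pending:
            return results
        await asyncio.sleep(delay)
    raise RuntimeError(f"still failing: {sorted(pending)}")
```

The extra bookkeeping and the fact that kept results may predate a later successful retry correspond to the two cons listed for this option: more complicated logic and the possibility of stale data.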
Top GitHub Comments
Having discussed with @efiop and @dberenbaum in the 21/09/14 cross team meeting we came to the following conclusion for next steps:
- [...] status (to match up with git).
- [...] --dvc-only).
- [...] dvc stage status).

Points to discuss:
- [...] dvc status be migrated to a separate command (e.g. dvc stage status) or should there be a flag added to condense the information to what we need?
- [...] git handles this is shown above (https://github.com/iterative/vscode-dvc/issues/772#issuecomment-917760251)

I will start a document now in notion here. We can continue the discussion there.
Thanks, @mattseddon! Looks great.
We actually have a proposal template in Notion. I didn’t want to ask you to do that much work, but you basically filled the template already in the doc you created, so I transferred your text into a proposal in https://www.notion.so/iterative/Consolidate-repo-status-ed3cd60f706f4fcaba1d3f3cac1498e9.
I’d like to flesh it out with some DVC requirements since this work should also be about improving the user experience within DVC. Take a look and let me know if/when it’s okay to start adding on to the proposal.