`data:status`: errors when data not in cache
Bug Report
Description
On initially cloning a repository, all tracked paths are shown in three categories:
- Not in cache
- Committed modified
- Uncommitted deleted
There can also be missing data. For example, in the vscode-dvc demo project, training_metrics is missing. This is produced by DVCLive and is listed under the plots key in the dvc.yaml.
This breaks one of the workflows in the VS Code extension.
Reproduce
git clone https://github.com/iterative/vscode-dvc
cd vscode-dvc/demo
python3 -m virtualenv .env
source .env/bin/activate
pip install -r requirements.txt
dvc data status --show-json --with-dirs --granular --untracked --unchanged
{
    "not_in_cache": [
        "model.pt",
        "misclassified.jpg",
        "predictions.json"
    ],
    "committed": {
        "modified": [
            "model.pt",
            "misclassified.jpg",
            "predictions.json"
        ]
    },
    "uncommitted": {
        "deleted": [
            "model.pt",
            "misclassified.jpg",
            "predictions.json"
        ]
    }
}
Note: DVC may need to be updated to 2.15.0 in the requirements.txt file.
Expected
Paths are returned under the not_in_cache key only. All paths are returned.
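A minimal sketch of how a consumer of the `--show-json` output could check this expectation. The helper name `overlapping_paths` is hypothetical and not part of DVC; the JSON is the output reported in the Reproduce section above.

```python
import json


def overlapping_paths(status: dict) -> dict:
    """Map each not_in_cache path to the other categories that also list it.

    Hypothetical helper for illustration only; it is not part of DVC.
    """
    missing = set(status.get("not_in_cache", []))
    overlaps = {}
    for group in ("committed", "uncommitted"):
        for state, paths in status.get(group, {}).items():
            for path in paths:
                if path in missing:
                    overlaps.setdefault(path, []).append(f"{group}.{state}")
    return overlaps


# Output of `dvc data status --show-json ...` as reported in this issue.
reported = json.loads("""
{
    "not_in_cache": ["model.pt", "misclassified.jpg", "predictions.json"],
    "committed": {
        "modified": ["model.pt", "misclassified.jpg", "predictions.json"]
    },
    "uncommitted": {
        "deleted": ["model.pt", "misclassified.jpg", "predictions.json"]
    }
}
""")

for path, categories in sorted(overlapping_paths(reported).items()):
    print(path, "->", ", ".join(categories))
```

With the expected behaviour, the helper would return an empty mapping, since each path would appear under not_in_cache only.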
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.15.0 (pip)
---------------------------------
Platform: Python 3.10.5 on macOS-12.2.1-arm64-arm-64bit
Supports:
webhdfs (fsspec = 2022.5.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
https (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
s3 (s3fs = 2022.5.0, boto3 = 1.21.21)
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: https
Workspace directory: apfs on /dev/disk3s1s1
Repo: dvc (subdir), git
Additional Information (if any):
Also verified on dvc-2.15.1.dev13+g9ff18502.
Issue Analytics
- State:
- Created a year ago
- Comments:13 (10 by maintainers)
Top GitHub Comments
For committed changes where there's no cache, we probably can just look at the hashes and tell they are unchanged, and only report not in cache, so we can avoid modified here.
I'll try to look into more scenarios where we can avoid the modified/deleted stuff. I have been using this definition of not in cache for now:
@mattseddon, I am working on it, hopefully by the end of this week. 🙂
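The idea in the first comment above (compare the recorded hashes, and report a path as modified only when they actually differ) can be sketched roughly as follows. This is an illustration under stated assumptions, not DVC's implementation: the function name, the plain path-to-hash dictionaries, and the placeholder hash values are all hypothetical.

```python
def classify_committed(head_hashes: dict, index_hashes: dict, cached_hashes: set) -> dict:
    """Classify tracked paths by comparing recorded hashes only.

    Hypothetical model, not DVC's API:
      head_hashes   - path -> hash recorded in the last commit
      index_hashes  - path -> hash recorded in the current .dvc/dvc.lock files
      cached_hashes - hashes whose data is present in the local cache
    """
    not_in_cache = []
    modified = []
    for path, index_hash in sorted(index_hashes.items()):
        # Missing cache data is reported on its own ...
        if index_hash not in cached_hashes:
            not_in_cache.append(path)
        # ... and "modified" is reported only when the hashes really differ.
        head_hash = head_hashes.get(path)
        if head_hash is not None and head_hash != index_hash:
            modified.append(path)

    result = {"not_in_cache": not_in_cache}
    if modified:
        result["committed"] = {"modified": modified}
    return result


# Fresh clone: hashes are identical and nothing is in the cache yet.
head = {"model.pt": "aaa", "predictions.json": "bbb"}
index = {"model.pt": "aaa", "predictions.json": "bbb"}
print(classify_committed(head, index, cached_hashes=set()))
```

In this model a fresh clone yields only a not_in_cache list, which matches the behaviour requested in the Expected section.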