Not able to push data of dependencies to the remote

Bug Report

Description

I’m not able to push data of dependencies in the dvc.yaml to the remote.

Reproduce

[Screenshot of …/dvc.yaml (2021-06-30 at 15 04 37); image not available]

$ dvc repro
$ dvc add ../../data/my_data.csv
$ dvc push ../../data/my_data.csv

Error: failed to push data to the cloud - '../../data/my_data.csv' does not exist as an output or a stage name in 'dvc.yaml': Stage '../../data/my_data.csv' not found inside 'dvc.yaml' file

Expected

my_data.csv is uploaded to the cloud successfully.

Environment information

  • dvc 2.4.3

Output of dvc doctor:

DVC version: 2.4.3 (conda)
---------------------------------
Platform: Python 3.8.10 on macOS-10.15.3-x86_64-i386-64bit
Supports: http, https
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5
Caches: local
Remotes: local
Workspace directory: apfs on /dev/disk1s5
Repo: dvc, git

Additional Information (if any):

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 17 (6 by maintainers)

Top GitHub Comments

pmrowla commented, Jul 7, 2021 (4 reactions)

@Christoph-1 using the rules I suggested, my_data.csv will be ignored by the first rule data/**.

The subdirectory exclusion !data/**/ only applies to the subdirectory paths (which end in a trailing slash), and essentially just forces git to traverse into subdirectories (so that it can see the .dvc files). All files inside subdirectories will still be ignored due to the first rule.

data/folder/my_data.csv does not match !data/**/ since my_data.csv is not a directory.

So the way the rules work together is:

# ignore every file inside data/
data/**

# force traversal into subdirectories within data/ that would normally be skipped entirely due to the first rule
!data/**/

# un-ignore .dvc files inside data/
!data/**/*.dvc

Another way to think about it would be that these rules are equivalent to the following for data/folder/:

# ignore the contents of data (non-recursive)
data/*

# un-ignore data/folder/
!data/folder/

# ignore the contents of data/folder/ (non-recursive)
data/folder/*

# un-ignore .dvc files within data/folder/
!data/folder/*.dvc

# ... continue repeating this pattern for subdirs inside data/folder/

You can verify this behavior yourself using git check-ignore:

$ tree
.
└── data
    ├── foo
    ├── foo.dvc
    └── subdir
        ├── bar
        └── bar.dvc

2 directories, 4 files

$ cat .gitignore
/data/**
!/data/**/
!**/*.dvc

$ git check-ignore data/foo data/foo.dvc data/subdir/bar data/subdir/bar.dvc
data/foo
data/subdir/bar

You can see that only the .dvc files are un-ignored (excluded from the ignore rules) by these rules. My data file paths (foo and bar) remain ignored by the first rule.
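If you want to see which rule decides each path, here is a minimal sketch that rebuilds the tree above in a throwaway repository and runs git check-ignore with -v. It assumes git and a POSIX shell are available; the temporary directory and the variable names (repo, ignored) are only illustrative.

```shell
#!/bin/sh
# Build a throwaway repository that mirrors the tree shown above.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p data/subdir
touch data/foo data/foo.dvc data/subdir/bar data/subdir/bar.dvc

# Same three rules as the .gitignore shown above.
printf '%s\n' '/data/**' '!/data/**/' '!**/*.dvc' > .gitignore

# Plain mode lists only the paths that end up ignored.
ignored=$(git check-ignore data/foo data/foo.dvc data/subdir/bar data/subdir/bar.dvc)
printf 'ignored paths:\n%s\n\n' "$ignored"

# -v prints "<source>:<line>:<pattern><TAB><path>" for the last rule that
# matched each path, including negated (!) rules that un-ignore it.
git check-ignore -v data/foo data/foo.dvc data/subdir/bar data/subdir/bar.dvc
```

In the -v output, a line whose pattern starts with ! means the path matched a negated rule and is therefore not ignored.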

pmrowla commented, Jul 6, 2021 (3 reactions)

@Christoph-1 to properly exclude your .dvc files from the ignore rules you will need something like:

data/**
!data/**/
!data/**/*.dvc

The issue is that git will not traverse into subdirectories of an ignored dir unless the subdirectory itself is also explicitly excluded with a ! rule. So in your example, git won’t traverse into data/folder at all, since it is ignored by data/*, and the !data/folder/my_data.csv.dvc exclusion will never be considered.
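The difference between the two rule sets can be checked with a small sketch like the one below. It assumes git and a POSIX shell, uses only the two original rules quoted in this comment (data/* and the !data/folder/my_data.csv.dvc exclusion), and runs in a throwaway repository; the variable names (repo, before, after) are illustrative.

```shell
#!/bin/sh
# Throwaway repository with the layout discussed above.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p data/folder
touch data/folder/my_data.csv data/folder/my_data.csv.dvc

# Original rules: data/folder is ignored by data/*, so git never
# evaluates the negated rule for the file inside it.
printf '%s\n' 'data/*' '!data/folder/my_data.csv.dvc' > .gitignore
before=$(git check-ignore data/folder/my_data.csv.dvc)

# Suggested rules: keep subdirectories traversable, then un-ignore .dvc files.
printf '%s\n' 'data/**' '!data/**/' '!data/**/*.dvc' > .gitignore
after=$(git check-ignore data/folder/my_data.csv.dvc)

# check-ignore prints a path only when it is ignored, so an empty
# result means git can see the file.
echo "ignored with original rules:  ${before:-<nothing>}"
echo "ignored with suggested rules: ${after:-<nothing>}"
```

With the original rules the .dvc file should still be reported as ignored; with the suggested rules it should not be.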
