add: symlinks to `.dvc/cache/ha/sh` but creates `.dvc/cache/ha/sh.csv`

Description

When adding files with `dvc add ny.csv`, the files are correctly moved to the cache as `.dvc/cache/ae/78daccbcbb27cd66956368c36bdd8f.csv`. However, the created symlink points to `.dvc/cache/ae/78daccbcbb27cd66956368c36bdd8f` (missing the extension). This results in a dead link.

Reproduce

  1. Enable symlinks: `dvc config cache.type symlink`
  2. Create a DVC repo within a Google Drive folder on a Mac M1 (Idk if this is critical): `dvc init`
  3. `touch foo.csv && dvc add foo.csv`
  4. Confirm that the cache contains a file with the `.csv` extension, but the symlink points to a file without it
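
Step 4 can also be checked programmatically. The following is a minimal sketch, not part of the original report: the helper name `find_dead_symlinks` is hypothetical, and it assumes the workspace is an ordinary directory tree whose `.dvc` and `.git` folders should be skipped. It lists every symlink whose target does not exist, along with the path the link points to.

```python
import os
from pathlib import Path


def find_dead_symlinks(root):
    """Return sorted (link, target) pairs for symlinks under root whose target is missing.

    Hypothetical helper for illustration; it only reads the filesystem.
    """
    dead = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into DVC/Git internals; only workspace links matter here.
        dirnames[:] = [d for d in dirnames if d not in {".dvc", ".git"}]
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # Path.exists() follows the link, so it is False for a dangling symlink.
            if path.is_symlink() and not path.exists():
                dead.append((str(path), os.readlink(path)))
    return sorted(dead)


if __name__ == "__main__":
    for link, target in find_dead_symlinks("."):
        print(f"{link} -> {target} (missing)")
```

Run from the repo root, this would print one line per dead link, which makes it easy to compare the recorded target against what is actually in the cache.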

Expected

I expect the created symlink to not be dead on creation.

Environment information

DVC version: 2.10.2 (brew)
---------------------------------
Platform: Python 3.9.12 on macOS-12.3.1-arm64-arm-64bit
Supports:
        azure (adlfs = 2022.4.0, knack = 0.9.0, azure-identity = 1.10.0),
        gdrive (pydrive2 = 1.10.1),
        gs (gcsfs = 2022.3.0),
        webhdfs (fsspec = 2022.3.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        s3 (s3fs = 2022.3.0, boto3 = 1.21.21),
        ssh (sshfs = 2022.3.1),
        oss (ossfs = 2021.8.0),
        webdav (webdav4 = 0.9.7),
        webdavs (webdav4 = 0.9.7)

Additional Information (if any):

I have been unable to reproduce this in a test example. I suspect the issue is with Google Drive more than dvc, but I figured I’d drop a line.

Thanks for a fantastic tool!

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 2
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

3 reactions
tirkarthi commented, May 23, 2022

I am unable to reproduce this locally on Linux using a local filesystem. The docs on creating files through the API indicate that Google Drive might add an extension based on the file’s MIME type.

https://developers.google.com/drive/api/v3/reference/files/create

Subsequent GET requests include the read-only fileExtension property populated with the extension originally specified in the name property. When a Google Drive user requests to download a file, or when the file is downloaded through the sync client, Drive builds a full filename (with extension) based on the name. In cases where the extension is missing, Google Drive attempts to determine the extension based on the file’s MIME type.

See also: https://support.google.com/drive/thread/61927468/backup-snyc-adds-unwanted-file-extensions?hl=en
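
If Drive is indeed appending extensions, the affected cache entries can be located by name. The sketch below is an illustration only, not something from the thread: `find_suffixed_cache_entries` is a hypothetical helper, and it assumes the cache layout seen in the issue's paths (a two-character hex directory containing a 30-character hex file name). Entries ending in `.dir`, which DVC uses for directory listings, are left alone. It reports each suffixed file alongside the name the symlink would expect, and it does not rename anything.

```python
import re
from pathlib import Path

# Assumed layout, taken from the paths in the issue: <2 hex chars>/<30 hex chars>.
# A ".dir" suffix is a legitimate DVC entry, so it is excluded from the match.
_HASH_WITH_SUFFIX = re.compile(r"^([0-9a-f]{30})\.(?!dir$)\w+$")


def find_suffixed_cache_entries(cache_dir):
    """Return (actual_path, expected_path) pairs for cache files with an extra extension."""
    found = []
    for path in sorted(Path(cache_dir).glob("[0-9a-f][0-9a-f]/*")):
        match = _HASH_WITH_SUFFIX.match(path.name)
        if match and path.is_file():
            # The expected name is the bare hash, with the appended extension dropped.
            found.append((path, path.with_name(match.group(1))))
    return found


if __name__ == "__main__":
    for actual, expected in find_suffixed_cache_entries(".dvc/cache"):
        print(f"{actual} (expected {expected.name})")
```

Reporting rather than renaming is deliberate here, since a sync client could re-apply the extension after any local change.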

1 reaction
awadell1 commented, May 26, 2022

Gotcha, thank you!
