add: incredibly slow

Bug Report

Description

I have a dataset that consists of videos (large files) that sit alongside some metadata in JSON files. If the metadata (JSON files) is updated and the dataset directory is re-added, DVC seems to rehash everything again?

dataset

  • A a_1.avi a_2.avi a.json
  • B b_1.avi b_2.avi b.json …

After updating a handful of JSON files, dvc add of the dataset takes ~1 hour to re-compute the md5 hashes. That doesn’t make sense, given that all the large files are untouched and already in the local DVC cache.
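
To make the expectation concrete, here is a minimal Python sketch of hashing a directory while skipping files whose size and modification time have not changed. It is only an illustration of the idea, not DVC's implementation: the names file_md5, hash_tree and the in-memory cache are hypothetical, and a real tool would persist the cache between runs.

```python
import hashlib
import os
from typing import Dict, Tuple

# Hypothetical in-memory cache: path -> ((size, mtime_ns), md5 hex digest).
HashCache = Dict[str, Tuple[Tuple[int, int], str]]


def file_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large videos are never loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: str, cache: HashCache) -> Tuple[Dict[str, str], int]:
    """Return ({path: md5}, number of files whose content was actually read)."""
    hashes: Dict[str, str] = {}
    rehashed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            signature = (st.st_size, st.st_mtime_ns)
            cached = cache.get(path)
            if cached is not None and cached[0] == signature:
                # Unchanged size and mtime: reuse the stored hash.
                hashes[path] = cached[1]
            else:
                # New or modified file: read it and refresh the cache entry.
                hashes[path] = file_md5(path)
                cache[path] = (signature, hashes[path])
                rehashed += 1
    return hashes, rehashed
```

With a cache like this, a second pass after editing only the JSON files would read just those files. Whether that holds in practice depends on the file system reporting stable metadata, which is an assumption of the sketch.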

Reproduce

  1. Start with an existing DVC dataset directory containing several large files
  2. Update the metadata JSON files (alongside the large files)
  3. dvc add dataset

Expected

I would expect it to take just a few minutes; however, it’s taking an hour…

Environment information

Output of dvc doctor:

$ dvc doctor

DVC version: 2.8.2 (pip)
---------------------------------
Platform: Python 3.8.0 on Linux-3.10.0-1160.45.1.el7.x86_64-x86_64-with-glibc2.27
Supports:
	webhdfs (fsspec = 2021.10.1),
	http (aiohttp = 3.8.0, aiohttp-retry = 2.4.6),
	https (aiohttp = 3.8.0, aiohttp-retry = 2.4.6),
	s3 (s3fs = 2021.10.1, boto3 = 1.17.106)
Cache types: hardlink, symlink
Cache directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Caches: local
Remotes: s3
Workspace directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Repo: dvc, git

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 8 (1 by maintainers)

Top GitHub Comments

3 reactions
wdixon commented, Nov 17, 2021

We can close, thank you for the work and the response!

1 reaction
Mastergalen commented, Nov 16, 2021

I copied the directory to the local file system and ran the same dvc add command. This time it only took 1 minute.

Seems like it’s something to do with nfs?

EDIT: Looks like the team is already aware of this issue: #5562
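
One way to check the NFS hypothesis independently of DVC is to time a plain read-and-md5 pass over the same data in both locations. The Python sketch below is a rough diagnostic under stated assumptions: time_md5_pass is a hypothetical helper, not part of DVC, and the two directory paths you pass to it are placeholders for the NFS copy and the local copy.

```python
import hashlib
import os
import time
from typing import Tuple


def time_md5_pass(root: str, chunk_size: int = 1024 * 1024) -> Tuple[int, float]:
    """Read and md5 every file under root; return (bytes read, seconds taken)."""
    total = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            digest = hashlib.md5()
            with open(os.path.join(dirpath, name), "rb") as fh:
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    digest.update(chunk)
                    total += len(chunk)
    return total, time.perf_counter() - start
```

Calling time_md5_pass once on the NFS directory and once on the local copy gives two (bytes, seconds) pairs to compare. Repeated runs over the same files can be skewed by the operating system's page cache, so the first run on each location is the most telling.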
