used_objs: add progress bar

Bug Report

Description

My dvc push command seemed to hang, so I ran it with --verbose to see where it was hanging. It “sat” quietly for ~40 secs without any output after the DEBUG: Check for update is enabled. line:

2021-07-02 14:01:42,969 DEBUG: Check for update is enabled.
2021-07-02 14:02:24,248 DEBUG: Preparing to upload data to 's3://xxxx/dvc/xxxx'
           ^^^^^^^^ 40 sec delay
2021-07-02 14:02:24,248 DEBUG: Preparing to collect status from s3://xxxx/dvc/xxxx
2021-07-02 14:02:24,449 DEBUG: Collecting information from local cache...
2021-07-02 14:02:35,735 DEBUG: Collecting information from remote cache...
2021-07-02 14:02:35,736 DEBUG: Querying 1 hashes via object_exists
2021-07-02 14:02:36,228 DEBUG: Indexing new .dir '5389a8fa94f8ec85a4051b3a3523c744.dir' with '629922' nested files
2021-07-02 14:02:43,280 DEBUG: Matched '0' indexed hashes
2021-07-02 14:02:43,616 DEBUG: `list_hashes()` returned max '122.0703125' hashes, skipping remaining results
2021-07-02 14:02:43,617 DEBUG: Estimated remote size: 503808 files
2021-07-02 14:02:43,617 DEBUG: Large remote ('15' hashes < '2519.04' traverse weight), using object_exists for remaining hashes
2021-07-02 14:02:43,617 DEBUG: Querying 15 hashes via object_exists
2021-07-02 14:02:45,074 WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: ml/tools/xxxx.json, md5: 6589f200ea392b8fc6bf0a909fd9ae51
2021-07-02 14:02:45,288 DEBUG: Uploading '.dvc/cache/d5/b5781b20bc08b688a72e622fb770e1' to 's3://xxxx/dvc/xxxx/d5/b5781b20bc08b688a72e622fb770e1'
1 file pushed
2021-07-02 14:02:47,379 DEBUG: Analytics is enabled.
2021-07-02 14:02:47,507 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/zb/vdy_45jx5txf0yzdy0lvrxf40000gq/T/tmpyr2qmrti']'
2021-07-02 14:02:47,516 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/zb/vdy_45jx5txf0yzdy0lvrxf40000gq/T/tmpyr2qmrti']'
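The silent stretch in a log like the one above can also be located mechanically rather than by eye. The following is a minimal sketch, assuming only the timestamp layout visible in these lines; `largest_gap` is a hypothetical helper written for this illustration and is not part of DVC.

```python
import re
from datetime import datetime

# Leading timestamp of a verbose log line, e.g. "2021-07-02 14:01:42,969".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def largest_gap(log_text):
    """Return (seconds, line_before, line_after) for the longest pause
    between two consecutive timestamped lines, or None if there are
    fewer than two such lines."""
    stamped = []
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line)
        if match:  # annotation lines without a timestamp are skipped
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            stamped.append((when, line))

    best = None
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


# First lines of the log from this report.
sample = """\
2021-07-02 14:01:42,969 DEBUG: Check for update is enabled.
2021-07-02 14:02:24,248 DEBUG: Preparing to upload data to 's3://xxxx/dvc/xxxx'
2021-07-02 14:02:24,248 DEBUG: Preparing to collect status from s3://xxxx/dvc/xxxx
2021-07-02 14:02:24,449 DEBUG: Collecting information from local cache...
"""

gap, before, after = largest_gap(sample)
print(f"{gap:.3f}s of silence between:\n  {before}\n  {after}")
```

On the sample lines this points at the pause right after the "Check for update is enabled." line, which is the delay described in this report.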

Running it with the -vv flag doesn’t really help:

2021-07-02 14:39:51,927 TRACE: Namespace(cprofile=False, cprofile_dump=None, pdb=False, instrument=False, instrument_open=False, quiet=0, verbose=2, version=None, cd='.', cmd='push', jobs=None, targets=[], remote=None, all_branches=False, all_tags=False, all_commits=False, with_deps=False, recursive=False, run_cache=False, glob=False, func=<class 'dvc.command.data_sync.CmdDataPush'>)
2021-07-02 14:39:52,096 DEBUG: Check for update is enabled.
2021-07-02 14:39:52,376 TRACE: Assuming '/xxxx/.dvc/cache/53/89a8fa94f8ec85a4051b3a3523c744.dir' is unchanged since it is read-only
2021-07-02 14:39:56,515 TRACE: Assuming '/xxxx/.dvc/cache/53/89a8fa94f8ec85a4051b3a3523c744.dir' is unchanged since it is read-only
2021-07-02 14:40:31,364 DEBUG: Preparing to upload data to 's3://xxxx/dvc/xxxx'
           ^^^^^^^^ 36 sec delay
2021-07-02 14:40:31,364 DEBUG: Preparing to collect status from s3://xxxx/dvc/xxxx
2021-07-02 14:40:31,570 DEBUG: Collecting information from local cache...

Reproduce

It reproduces for me locally, but I’m not sure what triggers this.

Expected

The --verbose flag gives the user enough context to not wonder what’s happening.
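One way to meet that expectation is to emit a periodic progress line from any long-running loop. The sketch below is only an illustration using the standard library; `with_heartbeat`, the logger name, and the label are hypothetical and do not reflect how DVC collects objects or how the eventual fix was implemented.

```python
import logging
import time

logger = logging.getLogger("heartbeat_example")  # hypothetical logger name


def with_heartbeat(items, label, interval=5.0, clock=time.monotonic):
    """Yield items unchanged, logging a DEBUG progress line at most once
    per `interval` seconds, plus a final summary line."""
    last = clock()
    count = 0
    for item in items:
        count += 1
        now = clock()
        if now - last >= interval:
            logger.debug("%s: %d items processed so far", label, count)
            last = now
        yield item
    logger.debug("%s: done, %d items", label, count)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    # 629922 is the nested file count reported in the log from this issue.
    total = sum(1 for _ in with_heartbeat(range(629922), "Collecting objects", interval=0.05))
    print(total)
```

A time-based interval keeps the output quiet for fast runs while still guaranteeing a sign of life during a 40 second stretch; a progress bar library would serve the same purpose with nicer rendering.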

Environment information

Output of dvc doctor:

DVC version: 2.4.3 (brew)
---------------------------------
Platform: Python 3.9.5 on macOS-11.4-x86_64-i386-64bit
Supports: azure, gdrive, gs, http, https, s3, ssh, oss, webdav, webdavs
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5s1
Caches: local
Remotes: s3
Workspace directory: apfs on /dev/disk1s5s1
Repo: dvc, git

Additional Information (if any):

cprofile dump: dvc-push-dump.prof.zip
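A dump like this can be inspected with Python's standard `pstats` module to see which calls account for the silent period. The sketch below is a minimal example under assumptions: it profiles a stand-in function (`slow_collect`, hypothetical) so that it runs on its own, and `top_cumulative` is a helper written for this illustration. To look at the attached file instead, unzip it and pass its path to `top_cumulative`. The dump was presumably produced through the `cprofile_dump` option visible in the TRACE namespace of the -vv output; the exact command-line spelling is not shown in this report.

```python
import cProfile
import os
import pstats
import tempfile


def top_cumulative(prof_path, limit=10):
    """Return (location, cumulative_seconds) pairs from a cProfile dump,
    slowest first."""
    stats = pstats.Stats(prof_path)
    rows = []
    # stats.stats maps (file, line, function) -> (cc, nc, tottime, cumtime, callers)
    for (filename, lineno, funcname), (_cc, _nc, _tt, cumtime, _callers) in stats.stats.items():
        rows.append((f"{filename}:{lineno}({funcname})", cumtime))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def slow_collect():
    """Stand-in workload; not related to DVC internals."""
    return sum(i * i for i in range(200_000))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.prof")
    profiler = cProfile.Profile()
    profiler.runcall(slow_collect)
    profiler.dump_stats(path)
    report = top_cumulative(path, limit=5)

for location, seconds in report:
    print(f"{seconds:8.3f}s  {location}")
```

Sorting by cumulative time is a reasonable first pass for a "where did the 40 seconds go" question, since it surfaces the high-level call that contains the slow work rather than the innermost hot loop.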

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 14 (9 by maintainers)

Top GitHub Comments

2 reactions
agurtovoy commented, Jul 6, 2021

@efiop Tested 2.5.0, can confirm that the 40 sec delay is gone and the overall performance is much better now! 🎉

I did notice that neither --verbose nor -vv produces the log output it used to, though; instead, I just get a “Querying cache” progress bar, followed by this permanent terminal output:

$ dvc --verbose push
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: ml/tools/xxxx.json, md5: 6589f200ea392b8fc6bf0a909fd9ae51
Everything is up to date.

Not sure if this change in behavior was intentional.

1 reaction
agurtovoy commented, Jul 6, 2021

@efiop Haha, you’re right, that was it, putting the flag after the command did it. I’m surprised I got it right the first time around, lol.

Would be happy to connect, just shoot me an invite to my Github username +@acm.org.
