status: takes too long to get status

Bug Report

Description

I have DVC set up in the root of my project folder, which is at

C:\Users\raylu\Documents\Github\audit-engine

The stage file is located at

resources\WI_Ozaukee_20201103\dvc\precheck\dvc.yaml

I issue this command:

dvc status -R -v -v -v --show-json  resources\WI_Ozaukee_20201103\dvc

And I expect that it will walk the subtree under

C:\Users\raylu\Documents\Github\audit-engine\resources\WI_Ozaukee_20201103\dvc

to look for dvc.yaml stage files. Instead, it appears to walk the full tree below

C:\Users\raylu\Documents\Github\audit-engine

and this takes 75 seconds (there are 112 GB of data). That it walks the full tree is just a hunch, though. To test it, we temporarily moved the .dvc folder into

C:\Users\raylu\Documents\Github\audit-engine\resources\WI_Ozaukee_20201103\dvc

and then it takes only 5.6 seconds (which is still pretty long). This should probably take only a second or two, because getting the ETags from the three S3 files is very fast and DVC needs to find only one stage file. Something seems to be wrong here.
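One way to check the hunch independently of DVC is to time a plain directory walk from each starting point. The following is a minimal sketch, not DVC's own collection code: `find_stage_files` and `timed_walk` are hypothetical helper names, and the walk only looks for files named dvc.yaml.

```python
import os
import time


def find_stage_files(root, filename="dvc.yaml"):
    """Walk `root` and return (matching paths, number of directories visited)."""
    matches = []
    visited = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        visited += 1
        if filename in filenames:
            matches.append(os.path.join(dirpath, filename))
    return matches, visited


def timed_walk(root):
    """Return (matches, directories visited, elapsed seconds) for one walk."""
    start = time.perf_counter()
    matches, visited = find_stage_files(root)
    return matches, visited, time.perf_counter() - start
```

Calling `timed_walk` once on the project root and once on the resources\WI_Ozaukee_20201103\dvc subfolder shows how much of the elapsed time is spent purely on traversing directories, as opposed to anything DVC does with the stage file afterwards.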

Reproduce

To reproduce this, dvc must be configured with no scm, no remote, no cache and use -R in status, so it can find the dvc.yaml stage files. We have only one.

Expected

See above.

Environment information

Output of dvc doctor:

$ dvc doctor

DVC version: 2.6.4 (pip)
---------------------------------
Platform: Python 3.7.6 on Windows-10-10.0.19041-SP0
Supports:
        http (requests = 2.24.0),
        https (requests = 2.24.0),
        s3 (s3fs = 2021.8.0, boto3 = 1.17.106)

Additional Information (if any): I will attach the profile dump and plot.

Profile Dump

https://cdn.discordapp.com/attachments/882823608949411850/884465153716920380/dump.prof

https://cdn.discordapp.com/attachments/882823608949411850/884467942111203348/image_output.png
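For anyone who wants to inspect the dump locally, here is a minimal sketch, assuming dump.prof is in the format written by Python's cProfile (the issue does not state this explicitly). `top_functions` is a hypothetical helper name; the `pstats` calls are from the standard library.

```python
import pstats


def top_functions(profile_path, limit=20):
    """Print the `limit` entries with the highest cumulative time."""
    stats = pstats.Stats(profile_path)
    # strip_dirs() shortens file paths; "cumulative" sorts by time spent in a
    # function including everything it calls.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return stats
```

After downloading the file, `top_functions("dump.prof")` lists the calls that dominate the run, which should show whether directory traversal accounts for most of the 75 seconds.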

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 16 (4 by maintainers)

Top GitHub Comments

4 reactions
pmrowla commented, Sep 7, 2021

To clarify, the reason for the current (stage/pipeline collection) behavior is that for dvc status <target>, <target> could be either a directory containing a dvc.yaml file, or the output for some dvc.yaml file outside of <target>.

So if I had a repo with path/dvc.yaml containing:

stages:
  foo:
    outs:
      - path/to/dir

Given the command dvc status path/to/dir, DVC still has to search the parent directories path/ and path/to/ for the dvc.yaml file that declares the output path/to/dir, instead of limiting the search to path/to/dir itself.

But I think the issue here is that when using -R/--recursive <target>, the user is explicitly telling DVC to look recursively for dvc.yaml and .dvc files inside the target path (which implies that <target> is not a stage output). So we could potentially skip the parent directory search when using -R.
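The difference between the two lookups can be sketched in a few lines. This is only an illustration of the idea described above, not DVC's actual implementation: `collect_recursive` and `candidate_parent_stage_files` are hypothetical names, and real stage collection would also have to parse each dvc.yaml to check whether it declares the target as an output.

```python
import os


def collect_recursive(target, name="dvc.yaml", suffix=".dvc"):
    """Search only inside `target`, which is what -R arguably implies."""
    found = []
    for dirpath, dirnames, filenames in os.walk(target):
        # Do not descend into DVC's internal .dvc directory.
        dirnames[:] = [d for d in dirnames if d != ".dvc"]
        for filename in filenames:
            if filename == name or filename.endswith(suffix):
                found.append(os.path.join(dirpath, filename))
    return sorted(found)


def candidate_parent_stage_files(target, repo_root, name="dvc.yaml"):
    """Check each ancestor of `target`, up to `repo_root`, for a dvc.yaml.

    This mirrors the non-recursive case, where `target` may be an output
    declared by a stage file that lives outside of `target`.
    """
    found = []
    current = os.path.abspath(target)
    root = os.path.abspath(repo_root)
    while True:
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        candidate = os.path.join(parent, name)
        if os.path.isfile(candidate):
            found.append(candidate)
        if parent == root:
            break
        current = parent
    return found
```

With the example repo above, `candidate_parent_stage_files("path/to/dir", ".")` finds path/dvc.yaml, while `collect_recursive("path/to/dir")` never looks outside the target. Skipping the parent search under -R would mean running only the first function.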

2 reactions
daavoo commented, Sep 8, 2021

Would it help to decouple pipeline status (dvc stage status) from data status (dvc status), similar to how dvc add / dvc stage add were decoupled?
