dvc plots show: paths relative to `dvc.yaml` instead of `$PWD` in version 2.12.1 and 2.13.0

Bug Report

Description

In the new release 2.13.0, dvc plots show (or dvc plots diff) fails when the dvc.yaml file is not in the repository root and the plot data lives on a path relative to the root rather than to dvc.yaml.

For example with a repo structure like this

.
├── data
├── dvc_plots
├── modules
├── notebooks
├── pipelines

where

pipelines
├── segment_X
│   └── classification
│       ├── product_A
│       │   ├── dvc.lock
│       │   ├── dvc.yaml
│       │   └── params.yaml
│       ├── product_B
│       │   ├── dvc.lock
│       │   ├── dvc.yaml
│       │   └── params.yaml

and

data
├── segment_X
│   ├── classification
│   │   ├── product_A
│   │   │   └── precision_recall_curve.csv
│   │   ├── product_B
│   │   │   └── precision_recall_curve.csv

where in each dvc.yaml file and each stage we have

wdir: ../../../..

(i.e. the working directory is the root of the repo)

you get the following warning when calling dvc plots show

WARNING: 'pipelines/segment_X/classification/product_A/data/segment_X/classification/product_A/precision_recall_curve.csv' was not found in current workspace. 

and similarly with other pipelines and plots. dvc appears to resolve the plot path relative to the directory containing the corresponding dvc.yaml rather than relative to the stage's wdir: the path above is indeed not in the workspace, since the data directory sits in the repo root.
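To make the two resolutions concrete, here is a minimal sketch using only the Python standard library. It does not call dvc and says nothing about how dvc is implemented internally; the function names are hypothetical, and the paths are the ones from the layout and warning above.

```python
import posixpath

# Paths taken from the report; everything else here is illustrative.
DVC_YAML_DIR = "pipelines/segment_X/classification/product_A"
WDIR = "../../../.."
PLOT_PATH = "data/segment_X/classification/product_A/precision_recall_curve.csv"


def resolve_against_wdir(dvc_yaml_dir: str, wdir: str, target: str) -> str:
    """Hypothetical helper: resolve target relative to the stage's wdir."""
    return posixpath.normpath(posixpath.join(dvc_yaml_dir, wdir, target))


def resolve_against_dvc_yaml(dvc_yaml_dir: str, target: str) -> str:
    """Hypothetical helper: resolve target relative to the dvc.yaml directory."""
    return posixpath.normpath(posixpath.join(dvc_yaml_dir, target))


# wdir climbs four levels from the dvc.yaml directory, i.e. to the repo root.
stage_root = posixpath.normpath(posixpath.join(DVC_YAML_DIR, WDIR))

expected = resolve_against_wdir(DVC_YAML_DIR, WDIR, PLOT_PATH)
observed = resolve_against_dvc_yaml(DVC_YAML_DIR, PLOT_PATH)

print("stage root:", stage_root)
print("expected  :", expected)
print("observed  :", observed)
```

The second resolution reproduces the path from the warning exactly, which is consistent with wdir being ignored when the plot path is looked up.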

Reproduce

  1. Create a pipeline structured as above, with dvc.yaml in a directory other than the root and outputs in yet another directory.
  2. Add some plots to a stage in dvc.yaml. The cause might also theoretically come from templating, so try something like
stages:
  plot:
    wdir: ../../../..
    cmd: >-
      python modules/evaluate.py
      --params=${paths.params_file}
    deps: ...
    params:
      - ${paths.params_file}:
          - paths
    metrics: ...
    plots:
      - ${paths.precision_recall_curve}:
          title: "Precision recall curve"
          x: recall
          y: precision

where the params.yaml should be in the same directory as the dvc.yaml and contain the following:

paths:
  params_file: pipelines/segment_X/classification/product_A/params.yaml
  precision_recall_curve: data/segment_X/classification/product_A/precision_recall_curve.csv

  3. Call dvc repro to create the precision_recall_curve.csv in the first place
  4. Call dvc plots show or dvc plots diff

Expected

The behaviour of dvc up to 2.12.0, where the plot paths are found correctly, relative to $PWD instead of the directory where the corresponding dvc.yaml is located.

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 2.13.0 (pip)
---------------------------------
Platform: Python 3.10.5 on Linux-5.18.7-1-MANJARO-x86_64-with-glibc2.35
Supports:
        azure (adlfs = 2022.4.0, knack = 0.9.0, azure-identity = 1.10.0),
        webhdfs (fsspec = 2022.5.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p2
Caches: local
Remotes: azure
Workspace directory: ext4 on /dev/nvme0n1p2
Repo: dvc, git

Additional Information (if any): The new version of dvc has two new dependencies:

  • dvc-data-0.0.23
  • dvc-objects-0.0.23

I am not sure if either of these two could be the cause, but they were both bumped from version 0.0.16 in dvc 2.13.0
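When comparing a working and a broken environment, it can help to print the installed versions of these packages side by side. The sketch below uses only the standard library's importlib.metadata; the helper name installed_versions is hypothetical, and it assumes the distributions are published under the names dvc, dvc-data and dvc-objects, as listed above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    result: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions(["dvc", "dvc-data", "dvc-objects"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in both environments gives a quick diff of the dependency versions without relying on pip output formatting.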

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

pared commented, Jul 26, 2022 (1 reaction)

Hi @tibor-mach! Should be fixed in next release

tibor-mach commented, Jul 26, 2022 (0 reactions)

@pared Hi, how’s the progress on this one?
