exp apply: not working for failed experiments
Bug Report
Description
When an experiment fails and I want to change something and re-run it, I first need to apply it, make the changes, and add the new version to the queue. Unfortunately, running dvc exp apply <hash> on a failed experiment produces the following result:
2022-08-25 09:47:00,888 ERROR: '20ea06d' does not appear to be an experiment commit.: Experiment derived from 'celeryf', expected '3b0d8e3'.
------------------------------------------------------------
Traceback (most recent call last):
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/repo/experiments/apply.py", line 38, in apply
exps.check_baseline(exp_rev)
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/repo/experiments/__init__.py", line 378, in check_baseline
raise BaselineMismatchError(exp_baseline, baseline_sha)
dvc.repo.experiments.exceptions.BaselineMismatchError: Experiment derived from 'celeryf', expected '3b0d8e3'.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/cli/__init__.py", line 185, in main
ret = cmd.do_run()
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/cli/command.py", line 22, in do_run
return self.run()
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/commands/experiments/apply.py", line 14, in run
self.repo.experiments.apply(
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/repo/experiments/__init__.py", line 499, in apply
return apply(self.repo, *args, **kwargs)
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/repo/__init__.py", line 48, in wrapper
return f(repo, *args, **kwargs)
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/repo/scm_context.py", line 156, in run
return method(repo, *args, **kw)
File "/home/maciej/venvs/motor-generative-modelling/lib/python3.9/site-packages/dvc/repo/experiments/apply.py", line 40, in apply
raise InvalidExpRevError(rev) from exc
dvc.repo.experiments.exceptions.InvalidExpRevError: '20ea06d' does not appear to be an experiment commit.
------------------------------------------------------------
2022-08-25 09:47:00,891 DEBUG: Analytics is enabled.
2022-08-25 09:47:00,917 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpmqf25q63']'
2022-08-25 09:47:00,919 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpmqf25q63']
Reproduce
I cannot share my code, and I don’t think a toy example is needed here.
Expected
Changes to code/configuration files are applied in the workspace as they have been scheduled for execution.
Environment information
Output of dvc doctor:
DVC version: 2.18.1 (pip)
---------------------------------
Platform: Python 3.9.5 on Linux-5.4.0-124-generic-x86_64-with-glibc2.31
Supports:
azure (adlfs = 2022.4.0, knack = 0.9.0, azure-identity = 1.10.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.5.1),
https (aiohttp = 3.8.1, aiohttp-retry = 2.5.1),
s3 (s3fs = 2022.5.0, boto3 = 1.21.21),
webhdfs (fsspec = 2022.5.0)
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/ubuntu--vg-home--vg
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/mapper/ubuntu--vg-home--vg
Repo: dvc, git
Kind regards, macio232
Top GitHub Comments
For now, there is a slightly hacky way to check it out. You can try it with
exp apply needs to check the failed refs stash now. This usage of apply worked before the celery changes (since failed exps were just re-added to the regular queue), so this should be considered a regression (and it’s a simple fix): https://github.com/iterative/dvc/blob/063eb6904dc79c2e5be9e1b57f7ecaa781eded8b/dvc/repo/experiments/apply.py#L42
this just needs to be something like
(apply doesn’t pop from the stash, we just need to check that the git SHA exists in one of our stashes)
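The check described in the comment above can be sketched roughly as follows. This is an illustration only, not DVC's actual code: the function name find_stash_containing and the stash names are hypothetical, and each stash is modelled as a plain collection of git SHA strings rather than a real DVC stash object.

```python
from typing import Iterable, Mapping, Optional


def find_stash_containing(
    rev: str, stashes: Mapping[str, Iterable[str]]
) -> Optional[str]:
    """Return the name of the first stash that holds `rev`, or None.

    The lookup is read-only: nothing is popped or removed from any stash,
    so a queued or failed experiment stays where it is after the check.
    """
    for name, revs in stashes.items():
        if rev in set(revs):
            return name
    return None


# Hypothetical stash contents; real stashes would be read from git refs.
example_stashes = {
    "queue": ["3b0d8e3"],
    "failed": ["20ea06d"],
}

found_in = find_stash_containing("20ea06d", example_stashes)
```

With a check like this, a revision found in any stash could skip the baseline comparison that raises BaselineMismatchError in the traceback, while revisions found in no stash would still go through it.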