exp run: python: can't open file 'src/train.py'
Bug Report
Description
I added a couple of experiments with different parameters to the queue. I wanted to run them sequentially with dvc exp run --run-all, but I got some unexpected results:
- I realized that dvc checkout is run automatically for each experiment. I think this is bad, since I have to wait for all my images and annotations to be checked out once per experiment. I did not know why this was happening until I read this line in the source code: https://github.com/iterative/dvc/blob/master/dvc/command/experiments.py#L1293 That line implies that --temp means "each exp is run in a separate temporary directory instead of your workspace". I guess this is related to parallelizing different runs… but I think it should be possible to avoid it, since it does not scale to large datasets.
- The error in the issue title, python: can't open file 'src/train.py', appears because --temp requires src/train.py to be committed in both dvc and git so that the file is present later in the tmp copy (this is only my current belief; I could not confirm it yet). Everything works fine with dvc repro.
Reproduce
- dvc init
- import and add dataset
- create src/train.py script
- dvc exp run --queue
- dvc exp run --queue -S optimizer.lr=0.1
- dvc exp run --run-all
Expected
I expected my experiments to run in sequential order without duplicating my data in another tmp directory.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.5.4 (conda)
---------------------------------
Platform: Python 3.8.10 on Linux-4.15.0-96-generic-x86_64-with-glibc2.10
Supports:
gdrive (pydrive2 = 1.9.1),
http (requests = 2.26.0),
https (requests = 2.26.0)
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sdb1
Caches: local
Remotes: None
Workspace directory: ext4 on /dev/sdb1
Repo: dvc, git
Additional Information (if any):
$ dvc exp show
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Experiment   ┃ Created      ┃ test_split.split ┃ optimizer.lr ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ workspace    │ -            │ -                │ -            │
│ main         │ Aug 19, 2021 │ -                │ -            │
│ ├── *098b5bc │ 01:07 PM     │ 0.1              │ 0.1          │
│ └── *6980253 │ 01:06 PM     │ -                │ -            │
└──────────────┴──────────────┴──────────────────┴──────────────┘

Comments
This might be related: https://github.com/iterative/dvc/issues/6490
This is expected behavior, since src/train.py is not tracked by git; see the exp run docs (https://dvc.org/doc/command-reference/exp/run#queueing-and-parallel-execution).
As you noted, the --queue/--run-all functionality is limited right now, and only supports running experiments in their own separate tempdir workspaces. This behavior is also documented.
To run an experiment in your main repo working directory, currently you cannot use the --queue/--run-all functionality. But if you just run dvc exp run without those options, it will run in your workspace the same way as dvc repro (and it will work properly even with an untracked src/train.py file).
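Since the failure comes from src/train.py not being tracked by git, a small pre-flight check before queueing experiments can catch the problem early. The following is only a minimal sketch, assuming git is available on PATH; the helper names is_tracked_by_git and untracked_paths are hypothetical and are not part of DVC.

```python
import subprocess


def is_tracked_by_git(path, repo_dir="."):
    """Return True if `path` is in the Git index of the repo at `repo_dir`.

    Relies on `git ls-files --error-unmatch`, which exits with a non-zero
    status when the given path is not known to Git.
    """
    result = subprocess.run(
        ["git", "ls-files", "--error-unmatch", "--", str(path)],
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def untracked_paths(paths, repo_dir="."):
    """Return the subset of `paths` that Git does not track (hypothetical helper)."""
    return [str(p) for p in paths if not is_tracked_by_git(p, repo_dir)]
```

For example, if untracked_paths(["src/train.py"]) returns a non-empty list, that suggests adding those files to git before running dvc exp run --queue, or running dvc exp run directly in the workspace instead.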