[Feature] Auto generate code to reproduce run

Overview

Currently, the wandb client saves an excellent collection of data for reproducibility, such as requirements.txt, the git repo and its state, and the command that was used to run the experiments. However, reproducing the experiments still takes legwork: the user needs to manually clone the repo, check out the branch, set up a venv, and run the experiments.

Describe the solution you’d like

It would be great to have auto-generated commands to reproduce the run: something users could just copy and paste into the terminal to reproduce the experiment. For example, consider this run. Using the data in the repo, it is possible to auto-generate code to reproduce this specific experiment, such as the following:

python -m venv venv
source venv/bin/activate
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
git clone https://github.com/vwxyzjn/cleanrl
cd cleanrl  # git checkout must run inside the cloned repository
git checkout -b "BreakoutNoFrameskip-v4__ppo_atari_visual__2__1591793872" a6d0a625ac7175e01b0562d281ea3429e69aae69
/opt/conda/bin/python ppo_atari_visual.py --gym-id BreakoutNoFrameskip-v4 --total-timesteps 10000000 --wandb-project-name cleanrl.benchmark --wandb-entity cleanrl --prod-mode --capture-video --seed 2

If code saving is enabled, it should also be possible to reproduce the run with:

python -m venv venv
source venv/bin/activate
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
curl -OL https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/code/cleanrl/ppo_atari_visual.py
python ppo_atari_visual.py --gym-id BreakoutNoFrameskip-v4 --total-timesteps 10000000 --wandb-project-name cleanrl.benchmark --wandb-entity cleanrl --prod-mode --capture-video --seed 2

The code samples above should work on Linux and macOS; Windows might require a different set of commands. Perhaps these commands could be shown here:

image

image
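
The generation step proposed above could be sketched as a small shell function. This is only an illustration: `generate_repro_commands` and its argument order are hypothetical and not part of the wandb client, and it assumes the run's file URL, repo URL, commit, and command are already known. It also emits a `cd` step after the clone, since `git checkout` has to run inside the cloned repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the wandb client): prints the commands
# needed to reproduce a run from metadata that the client already records.
# Arguments: <run_files_url> <repo_url> <commit> <branch_name> <run_command>
generate_repro_commands() {
  files_url=$1
  repo_url=$2
  commit=$3
  branch=$4
  run_command=$5
  # Directory created by "git clone": last URL component without ".git".
  repo_dir=$(basename "$repo_url" .git)

  printf '%s\n' \
    "python -m venv venv" \
    "source venv/bin/activate" \
    "pip install -r $files_url/requirements.txt" \
    "git clone $repo_url" \
    "cd $repo_dir" \
    "git checkout -b \"$branch\" $commit" \
    "$run_command"
}

# Example call using the values from the run discussed in this issue.
generate_repro_commands \
  "https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa" \
  "https://github.com/vwxyzjn/cleanrl" \
  "a6d0a625ac7175e01b0562d281ea3429e69aae69" \
  "BreakoutNoFrameskip-v4__ppo_atari_visual__2__1591793872" \
  "python ppo_atari_visual.py --gym-id BreakoutNoFrameskip-v4 --total-timesteps 10000000 --wandb-project-name cleanrl.benchmark --wandb-entity cleanrl --prod-mode --capture-video --seed 2"
```

The function only prints text, so the output can be reviewed before anything is pasted into a terminal.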

Subtleties of this issue:

There are some subtleties to this issue. Here are the ones I can think of.

  1. Sometimes people use conda to manage their experiments, which means the requirements.txt will contain entries like conda-build==3.18.11 that are not installable from PyPI, so installing the requirements.txt directly will fail. I think the correct way to handle this is to record the conda environment YAML file if conda is present, and to install the conda environment before installing the requirements.txt. For example:
curl -OL https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/conda_environment.yml # the client needs to auto save conda_environment.yml to make it work
conda env update --name venv --file conda_environment.yml
conda activate venv
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
git clone https://github.com/vwxyzjn/cleanrl
cd cleanrl  # git checkout must run inside the cloned repository
git checkout -b "BreakoutNoFrameskip-v4__ppo_atari_visual__2__1591793872" a6d0a625ac7175e01b0562d281ea3429e69aae69
/opt/conda/bin/python ppo_atari_visual.py --gym-id BreakoutNoFrameskip-v4 --total-timesteps 10000000 --wandb-project-name cleanrl.benchmark --wandb-entity cleanrl --prod-mode --capture-video --seed 2
  2. Some packages are installed in editable mode (i.e. a local installation of a package that is not available on PyPI). This will also cause the requirements.txt installation to fail. One solution I can think of is to just warn the user if editable packages are detected. See https://stackoverflow.com/questions/42582801/check-whether-a-python-package-has-been-installed-in-editable-egg-link-mode
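
The warning suggested in the second point could look roughly like the sketch below. It assumes the saved requirements.txt comes from `pip freeze`, which writes editable installs as lines starting with `-e ` and, on recent pip versions, local-path installs as `name @ file://...`. The function name `find_local_requirements` is hypothetical and not part of the wandb client.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the wandb client): prints, with
# line numbers, every entry of a requirements file that points at a local
# checkout and therefore cannot be fetched from PyPI.
# The exit status follows grep: 0 when at least one such entry is found.
find_local_requirements() {
  # "-e ..." marks an editable install; "name @ file://..." marks a package
  # that was installed from a local path.
  grep -n -E '^-e |@ file://' "$1"
}

requirements_file=${1:-requirements.txt}
if [ -f "$requirements_file" ] && find_local_requirements "$requirements_file"; then
  echo "warning: the entries listed above are local installs and may not be reproducible" >&2
fi
```

This only reports the problematic lines. Whether to skip them, fail, or ask the user is left open, as in the issue text.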

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
vwxyzjn commented, Apr 15, 2021

Thank you Aritra 😃

Btw, on second thought, although saving conda’s environment.yaml would ultimately be helpful, conda might have some cross-platform issues… Maybe the best practice is still to use venv, like

python -m venv venv
source venv/bin/activate
# ... do stuff like installing dependencies
python experiment.py

That way when reproducing it, it can be as simple as something like

python -m venv venv
source venv/bin/activate
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
git clone https://github.com/vwxyzjn/cleanrl
cd cleanrl  # git checkout must run inside the cloned repository
git checkout -b "BreakoutNoFrameskip-v4__ppo_atari_visual__2__1591793872" a6d0a625ac7175e01b0562d281ea3429e69aae69
python experiment.py

0 reactions
sydholl commented, Dec 7, 2021

@vwxyzjn Hi Costa! This feature was implemented in June 2021 and we are closing this ticket.
