[Feature] Auto generate code to reproduce run

Overview

Currently, the wandb client saves an excellent collection of data for reproducibility, such as requirements.txt, the git repo and its state, and the command that was used to run the experiment. However, reproducing an experiment still takes legwork: the user needs to manually clone the repo, check out the branch, set up a venv, and run the experiment.

Describe the solution you’d like

It would be great to have auto-generated commands to reproduce the run: something users could just copy and paste into the terminal to reproduce the experiment. For example, consider this run. Using the data in the repo, it is possible to auto-generate code to reproduce this specific experiment, such as the following:

python -m venv venv
source venv/bin/activate
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
git clone https://github.com/vwxyzjn/cleanrl
cd cleanrl
git checkout -b "BreakoutNoFrameskip-v4__ppo_atari_visual__2__1591793872" a6d0a625ac7175e01b0562d281ea3429e69aae69
/opt/conda/bin/python ppo_atari_visual.py --gym-id BreakoutNoFrameskip-v4 --total-timesteps 10000000 --wandb-project-name cleanrl.benchmark --wandb-entity cleanrl --prod-mode --capture-video --seed 2

If code saving is enabled, it should also be possible to reproduce the run with:

python -m venv venv
source venv/bin/activate
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
curl -OL https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/code/cleanrl/ppo_atari_visual.py
python ppo_atari_visual.py --gym-id BreakoutNoFrameskip-v4 --total-timesteps 10000000 --wandb-project-name cleanrl.benchmark --wandb-entity cleanrl --prod-mode --capture-video --seed 2
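
As a rough illustration of what such a generator could look like, here is a minimal Python sketch that assembles either command list from recorded run metadata. The function names and parameters are hypothetical and are not part of the wandb client; the file URL layout simply mirrors the URLs used in the examples above and is an assumption.

```python
import shlex
from typing import List, Optional


def files_url(entity: str, project: str, run_id: str, name: str) -> str:
    # Assumption: run files are served under this layout, as in the examples above.
    return f"https://api.wandb.ai/files/{entity}/{project}/{run_id}/{name}"


def generate_repro_commands(
    *,
    entity: str,
    project: str,
    run_id: str,
    command: List[str],                     # argv of the original run
    repo_url: Optional[str] = None,
    commit: Optional[str] = None,
    branch_name: Optional[str] = None,
    saved_code_path: Optional[str] = None,  # set only when code saving is enabled
) -> List[str]:
    """Hypothetical helper: build shell commands that reproduce a run."""
    cmds = [
        "python -m venv venv",
        "source venv/bin/activate",
        "pip install -r " + files_url(entity, project, run_id, "requirements.txt"),
    ]
    if saved_code_path is not None:
        # Code saving enabled: download the saved script instead of cloning.
        cmds.append("curl -OL " + files_url(entity, project, run_id, "code/" + saved_code_path))
    else:
        if repo_url is None or commit is None or branch_name is None:
            raise ValueError("repo_url, commit and branch_name are required without saved code")
        # Derive the directory that `git clone` creates from the repo URL.
        repo_dir = repo_url.rstrip("/").split("/")[-1]
        if repo_dir.endswith(".git"):
            repo_dir = repo_dir[: -len(".git")]
        cmds += [
            f"git clone {repo_url}",
            f"cd {shlex.quote(repo_dir)}",
            f"git checkout -b {shlex.quote(branch_name)} {commit}",
        ]
    # shlex.join quotes each argument so the command survives copy and paste.
    cmds.append(shlex.join(command))
    return cmds
```

Calling `print("\n".join(generate_repro_commands(...)))` with the metadata of a run would produce a block the user can paste into a POSIX shell.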

The code samples above should work on Linux and macOS; Windows might require a different set of commands. The generated code could perhaps be shown here:

Subtleties of this issue:

There are some subtleties to this issue. Here are the ones I can think of.

Sometimes people use conda to manage their experiments, which means requirements.txt will contain entries like conda-build==3.18.11 that are not installable from PyPI, so installing requirements.txt directly will fail. I think the correct way to handle this is to record the conda environment YAML file if conda is present, and to set up the conda environment before installing requirements.txt, like:

curl -OL https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/conda_environment.yml # the client needs to auto save conda_environment.yml to make it work
conda env update --name venv --file conda_environment.yml
conda activate venv
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
git clone https://github.com/vwxyzjn/cleanrl
cd cleanrl
git checkout -b "BreakoutNoFrameskip-v4__ppo_atari_visual__2__1591793872" a6d0a625ac7175e01b0562d281ea3429e69aae69
/opt/conda/bin/python ppo_atari_visual.py --gym-id BreakoutNoFrameskip-v4 --total-timesteps 10000000 --wandb-project-name cleanrl.benchmark --wandb-entity cleanrl --prod-mode --capture-video --seed 2
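
To make the "if conda is present" check concrete, here is a small Python sketch. It relies on the fact that a conda environment's prefix contains a `conda-meta` directory; the helper names are hypothetical, and `conda_environment.yml` is the file name proposed above, not something the client is known to save.

```python
import os
import sys
from typing import List


def is_conda_env(prefix: str = sys.prefix) -> bool:
    """Heuristic: conda environments keep package metadata in <prefix>/conda-meta."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def setup_commands(uses_conda: bool, files_base_url: str) -> List[str]:
    """Hypothetical helper: pick conda or venv setup steps for the repro script."""
    requirements = f"pip install -r {files_base_url}/requirements.txt"
    if uses_conda:
        return [
            # Assumes the client saved conda_environment.yml alongside the run.
            f"curl -OL {files_base_url}/conda_environment.yml",
            "conda env update --name venv --file conda_environment.yml",
            "conda activate venv",
            requirements,
        ]
    return ["python -m venv venv", "source venv/bin/activate", requirements]
```

The client could call `is_conda_env()` when the run starts, store the result with the run, and later pass it to `setup_commands` when generating the reproduction steps.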

Some packages are installed in editable mode (i.e. a local installation of a package that is not available on PyPI). This will also cause the requirements.txt installation to fail. One solution I can think of is to simply warn the user if editable packages are detected. See https://stackoverflow.com/questions/42582801/check-whether-a-python-package-has-been-installed-in-editable-egg-link-mode
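
One way to produce such a warning is to scan the saved requirements.txt rather than inspect the environment. The sketch below is a heuristic, not the approach from the linked answer: it flags lines that start with `-e` or `--editable`, which is how editable installs usually appear in `pip freeze` output, plus `name @ file://...` entries that point at local paths. The function names are hypothetical.

```python
import warnings
from typing import List


def find_local_requirements(requirements_text: str) -> List[str]:
    """Return requirement lines that likely cannot be installed from PyPI."""
    flagged = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("-e ") or line.startswith("--editable"):
            flagged.append(line)  # editable install
        elif " @ file://" in line:
            flagged.append(line)  # direct reference to a local path
    return flagged


def warn_about_local_requirements(requirements_text: str) -> None:
    flagged = find_local_requirements(requirements_text)
    if flagged:
        warnings.warn(
            "These requirements may not be reproducible on another machine: "
            + ", ".join(flagged)
        )
```

Running `warn_about_local_requirements` when requirements.txt is saved would let the user know up front that the generated reproduction commands may fail.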

Issue Analytics

Created 2 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

vwxyzjn commented, Apr 15, 2021

Thank you Aritra 😃

Btw, on second thought, although saving conda’s environment.yaml will ultimately be helpful, conda might have some cross-platform issues… Maybe the best practice is still to use venv, like

python -m venv venv
source venv/bin/activate
#.... do stuff like install dependencies
python experiment.py

That way, reproducing the run can be as simple as something like

python -m venv venv
source venv/bin/activate
pip install -r https://api.wandb.ai/files/cleanrl/cleanrl.benchmark/1kpt8dsa/requirements.txt
git clone https://github.com/vwxyzjn/cleanrl
cd cleanrl
git checkout -b "BreakoutNoFrameskip-v4__ppo_atari_visual__2__1591793872" a6d0a625ac7175e01b0562d281ea3429e69aae69
python experiment.py

sydholl commented, Dec 7, 2021

@vwxyzjn Hi Costa! This feature was implemented in June 2021 and we are closing this ticket.
