Episode Length appears capped at 1000
I'm having trouble setting the episode length above 1000. I am running the command
python -m spinup.run ppo --hid "[8,8]" --env HalfCheetah-v2 --exp_name train_doggo --gamma 0.999 --max_ep_len 2000 --steps_per_epoch 4000
to use PPO on the Half Cheetah environment with a max_ep_len set to 2000. However, when I run the algorithm, the episode length is always 1000. This is an example output:
---------------------------------------
| Epoch | 48 |
| AverageEpRet | 7.02e+03 |
| StdEpRet | 1.76e+04 |
| MaxEpRet | 3.1e+04 |
| MinEpRet | -1.74e+04 |
| EpLen | 1e+03 |
| AverageVVals | 19.1 |
| StdVVals | 31.2 |
| MaxVVals | 40 |
| MinVVals | -33.1 |
| TotalEnvInteracts | 1.96e+05 |
| LossPi | 2.81e-08 |
| LossV | 1.09e+08 |
| DeltaLossPi | -0.0105 |
| DeltaLossV | -1.07e+04 |
| Entropy | 3.52 |
| KL | 0.0121 |
| ClipFrac | 0.103 |
| StopIter | 79 |
| Time | 127 |
---------------------------------------
If I set max_ep_len to less than 1000, say 500, the EpLen does in fact match max_ep_len. Additionally, the HalfCheetah environment always returns done=False, and I am not getting the "Trajectory cut off by epoch" warning message.
Any help would be appreciated in solving this issue!
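A quick way to check where done first flips to True (a minimal diagnostic sketch, assuming the classic Gym API and random actions; not part of the original report):

import gym

env = gym.make('HalfCheetah-v2')
env.reset()
t = 0
done = False
while not done:
    # classic Gym step() returns (obs, reward, done, info)
    _, _, done, _ = env.step(env.action_space.sample())
    t += 1
print('done first became True at step', t)  # 1000 with the default time limit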
Top GitHub Comments
Hi @Nate711,

Gym wraps the MuJoCo envs by default with TimeLimit environment wrappers. These limit Cheetah, and several other MuJoCo envs, to a max length of 1000 regardless of what you choose as max_ep_len for the algorithm. See here for the Gym registry entry for Cheetah, which specifies the max_episode_steps. (The algorithm only has the power to curtail episode length; it can't make an episode longer than the env permits.)

You're not getting "Trajectory cut off by epoch" warnings because the episodes are returning done=True at the 1000th timestep, and 1000 divides evenly into your batch size, so the last timestep of the batch is done=True.

Hope this helps!
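For illustration, here is a minimal sketch of one way to get episodes longer than 1000 steps, assuming the classic Gym API (the gym package contemporaneous with the v2 MuJoCo envs). The id 'HalfCheetahLong-v2' is hypothetical and chosen only for this example:

import gym
from gym.envs.registration import register

# Inspect the time limit baked into the standard registry entry (1000 for HalfCheetah-v2).
print(gym.spec('HalfCheetah-v2').max_episode_steps)

# Register a new id that points at the same underlying env class,
# but with a longer TimeLimit. 'HalfCheetahLong-v2' is a hypothetical id.
register(
    id='HalfCheetahLong-v2',
    entry_point='gym.envs.mujoco:HalfCheetahEnv',
    max_episode_steps=2000,
)

env = gym.make('HalfCheetahLong-v2')  # episodes can now run up to 2000 steps

With an env registered this way, setting max_ep_len=2000 for the algorithm can actually take effect, since the algorithm can only shorten episodes, never lengthen them past the env's own cap.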
Hi @Nate711, I hope I'm not interrupting at an inconvenient time, but I would like to know whether you used Spinning Up's PPO for your test run. I am trying to implement PPO from scratch, and the problem is that it does not seem to work well on HalfCheetah-v2 with the default parameters that Spinning Up's PPO uses. I have even tried running Spinning Up's version with your settings (using hidden_sizes [8, 8]), and the result was about the same, i.e. the average return after 50 epochs (200k env interactions) is around 200-300. Thank you in advance, and have a nice day.