Achieving reported training performance
We trained a few agents with the training code provided in the repo. If we don't change anything, the mean reward on the 500 training levels of the Starpilot game is around 5.6. If we remove the VecNormalize line in environment creation, we achieve a mean reward of around 9.2. However, the mean reward reported in the paper (Figure 4) is around 12.
Did you use VecNormalize in the paper?
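For readers unfamiliar with the wrapper: in OpenAI Baselines, VecNormalize can rescale rewards by a running estimate of the standard deviation of the discounted return, then clip them. The sketch below is a simplified, single-environment illustration of that idea in pure Python, not the library's implementation. The class name RunningReturnScaler and its defaults are hypothetical and chosen only for this example.

```python
import math


class RunningReturnScaler:
    """Hypothetical, simplified reward scaler (not the Baselines code).

    Divides each reward by the running standard deviation of the
    discounted return, then clips the result to [-clip, clip].
    """

    def __init__(self, gamma=0.99, clip=10.0, epsilon=1e-8):
        self.gamma = gamma
        self.clip = clip
        self.epsilon = epsilon
        self.ret = 0.0   # discounted return accumulated in the current episode
        self.count = 0   # number of return samples seen so far
        self.mean = 0.0  # running mean of the return samples
        self.m2 = 0.0    # running sum of squared deviations (Welford's method)

    def _update(self, x):
        # Welford's online update of the mean and squared deviations.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Population variance of the samples seen so far.
        return self.m2 / self.count if self.count > 0 else 1.0

    def __call__(self, reward, done):
        # Accumulate the discounted return and update its statistics.
        self.ret = self.ret * self.gamma + reward
        self._update(self.ret)
        # Scale by the running std, then clip.
        scaled = reward / math.sqrt(self.variance() + self.epsilon)
        scaled = max(-self.clip, min(self.clip, scaled))
        if done:
            self.ret = 0.0  # reset the accumulator at episode end
        return scaled


scaler = RunningReturnScaler()
raw = [1.0, 0.0, 5.0, 0.0, 10.0]
scaled = [scaler(r, done=(i == len(raw) - 1)) for i, r in enumerate(raw)]
print(scaled)
```

The agent is trained on the scaled values rather than the raw game rewards, which is one reason toggling the wrapper can change learning behaviour. Whether that explains the gap reported here is not established by this sketch.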
Issue Analytics
- State:
- Created 3 years ago
- Comments: 9 (2 by maintainers)
Top GitHub Comments
We were able to achieve a mean reward of up to 18.7 in the Starpilot game by introducing 8 initial states.
EDIT: We achieved 18.7 after 400M training timesteps with 8 initial states. At 200M timesteps, the best we achieved is 14.9 and the average is 14.16. Sorry for the inconvenience.
We achieved a reward of 13.9 in the Starpilot game by running 1 worker with 256 environments with VecNormalize. Therefore, I am closing this issue.