[question] Monitoring a custom environment
I am training an A2C algorithm on a custom environment using multiprocessing and `SubprocVecEnv`, as follows:
```python
from stable_baselines import A2C
from stable_baselines.common.policies import MlpLnLstmPolicy
from stable_baselines.common.vec_env import SubprocVecEnv

env = SubprocVecEnv([lambda i=i: CustomEnv(args, i) for i in range(args.cpus)])
model = A2C(MlpLnLstmPolicy, env, verbose=1, tensorboard_log=None, learning_rate=7e-3, lr_schedule="linear")
model.learn(total_timesteps=args.training_steps, log_interval=10)
```
I want to monitor the learning and save model checkpoints using a `Monitor` and callbacks; however, I can't figure out how to combine everything. I've tried:
```python
env = SubprocVecEnv([lambda i=i: CustomEnv(args, i) for i in range(args.cpus)])
env = Monitor(env, log_dir, allow_early_resets=True)
model = A2C(MlpLnLstmPolicy, env, verbose=1, tensorboard_log=None, learning_rate=7e-3, lr_schedule="linear")
model.learn(total_timesteps=args.training_steps, log_interval=10, callback=callback)
```
but I get the following exception:
```
Traceback (most recent call last):
  File "/main_file.py", line 96, in <module>
    env = train(args)
  File "/main_file.py", line 76, in train
    env = Monitor(env, log_dir, allow_early_resets=True)
  File "/miniconda3/envs/RL/lib/python3.7/site-packages/stable_baselines/bench/monitor.py", line 27, in __init__
    Wrapper.__init__(self, env=env)
  File "/miniconda3/envs/RL/lib/python3.7/site-packages/gym/core.py", line 210, in __init__
    self.reward_range = self.env.reward_range
AttributeError: 'SubprocVecEnv' object has no attribute 'reward_range'
```
So what is the correct way of using a monitor in this setting?
Top GitHub Comments
I also ran into similar problems with monitoring vectorized environments. It was straightforward to tweak VecMonitor from OpenAI Baselines to work with Stable Baselines, as suggested above. This is what I ended up with.
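A minimal sketch of such an adaptation for SB2, assuming the `VecEnvWrapper` base class from `stable_baselines.common.vec_env`. The class name, CSV columns, and logging scheme are illustrative: it writes a plain CSV rather than reusing `Monitor`'s own writer, and reports finished episodes through `info["episode"]` the way `Monitor` does.

```python
import csv
import time

import numpy as np

from stable_baselines.common.vec_env import VecEnvWrapper


class VecMonitor(VecEnvWrapper):
    """Track per-episode reward/length for every sub-environment of a VecEnv."""

    def __init__(self, venv, filename=None):
        VecEnvWrapper.__init__(self, venv)
        self.episode_rewards = np.zeros(self.num_envs, dtype=np.float64)
        self.episode_lengths = np.zeros(self.num_envs, dtype=np.int64)
        self.t_start = time.time()
        self.file = None
        self.writer = None
        if filename is not None:
            # Plain CSV output; does not reuse Monitor's results writer.
            self.file = open(filename, "w", newline="")
            self.writer = csv.writer(self.file)
            self.writer.writerow(["r", "l", "t"])

    def reset(self):
        obs = self.venv.reset()
        self.episode_rewards[:] = 0
        self.episode_lengths[:] = 0
        return obs

    def step_wait(self):
        obs, rewards, dones, infos = self.venv.step_wait()
        self.episode_rewards += rewards
        self.episode_lengths += 1
        new_infos = []
        for i, (done, info) in enumerate(zip(dones, infos)):
            info = dict(info)
            if done:
                episode = {
                    "r": float(self.episode_rewards[i]),
                    "l": int(self.episode_lengths[i]),
                    "t": round(time.time() - self.t_start, 6),
                }
                # Expose the finished episode like Monitor does, so
                # callbacks can read info["episode"].
                info["episode"] = episode
                if self.writer is not None:
                    self.writer.writerow([episode["r"], episode["l"], episode["t"]])
                    self.file.flush()
                self.episode_rewards[i] = 0
                self.episode_lengths[i] = 0
            new_infos.append(info)
        return obs, rewards, dones, new_infos

    def close(self):
        if self.file is not None:
            self.file.close()
        VecEnvWrapper.close(self)
```

It wraps the whole vectorized environment, so it slots in where the per-env `Monitor` failed:

```python
env = SubprocVecEnv([lambda i=i: CustomEnv(args, i) for i in range(args.cpus)])
env = VecMonitor(env, filename="vec_monitor.csv")  # path is illustrative
```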
In SB3 this is built in: simply wrap the vectorized environment with `VecMonitor` instead of `Monitor`.
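A sketch of that usage, reusing the question's `CustomEnv`, `args`, and `log_dir` for illustration (they are placeholders here, not part of SB3):

```python
from stable_baselines3 import A2C
from stable_baselines3.common.vec_env import SubprocVecEnv, VecMonitor

# VecMonitor wraps the whole VecEnv, so there is no per-env Monitor
# and no reward_range AttributeError.
env = SubprocVecEnv([lambda i=i: CustomEnv(args, i) for i in range(args.cpus)])
env = VecMonitor(env, filename=log_dir)  # log_dir as in the question

# SB3's A2C has no LSTM policy, so MlpPolicy is used here.
model = A2C("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=args.training_steps, log_interval=10)
```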
Sorry for the off-topic comment; I googled the error and ended up here.