[question] Using the monitor wrapper with already wrapped custom environment

I am trying to wrap my custom environment with the Monitor wrapper to get additional information about the episode rewards. But because I wrap the environment again afterwards with my own custom wrapper, the initial Monitor wrapping seems to have no effect. Is there a way to use the Monitor wrapper on custom environments? I have already seen issue #470, but the answers there did not help.

import os
import time

from gym import Wrapper, spaces
import numpy as np
from gym.envs.classic_control import PendulumEnv

from stable_baselines.common.env_checker import check_env
from stable_baselines.sac.policies import CnnPolicy
from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines.bench import Monitor

from skimage import data, color
from skimage.transform import rescale, resize, downscale_local_mean

import tensorflow as tf


class RGBArrayAsObservationWrapper(Wrapper):
    """
    Use env.render(rgb_array) as observation
    rather than the observation environment provides
    """

    def __init__(self, env):
        # TODO this might not work before environment has been reset
        super(RGBArrayAsObservationWrapper, self).__init__(env)
        self.reset()
        dummy_obs = env.render('rgb_array')
        dummy_obs_resized = resize(dummy_obs, (dummy_obs.shape[0] // 10, dummy_obs.shape[1] // 10),
                                   anti_aliasing=True)
        # Update observation space
        # TODO assign correct low and high
        self.observation_space = spaces.Box(low=0, high=255, shape=dummy_obs_resized.shape,
                                            dtype=dummy_obs_resized.dtype)

    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        obs = self.env.render("rgb_array")
        obs = resize(obs, (obs.shape[0] // 10, obs.shape[1] // 10),
                     anti_aliasing=True)
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        obs = self.env.render("rgb_array")
        obs = resize(obs, (obs.shape[0] // 10, obs.shape[1] // 10),
                     anti_aliasing=True)
        return obs, reward, done, info


# tensorboard --logdir=A2C_IMG_PENDULUM:C:\Users\meric\OneDrive\Masaüstü\TUM\Thesis\Pycharm\pioneer\a2c_pendulum_tensorboard --host localhost

log_dir = "/tmp/gym/{}".format(int(time.time()))
os.makedirs(log_dir, exist_ok=True)

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

TEST_COUNT = 100

pendulum_env = PendulumEnv()
pendulum_env = Monitor(pendulum_env, log_dir, allow_early_resets=True)
pendulum_env = RGBArrayAsObservationWrapper(pendulum_env)
check_env(pendulum_env, warn=True)

model = A2C("CnnPolicy", pendulum_env, verbose=1, tensorboard_log="./a2c_pendulum_tensorboard/")
model.learn(total_timesteps=100_000, log_interval=10)
model.save("a2c_pendulum")

sum_rewards = 0
done = False
obs = pendulum_env.reset()
for i in range(TEST_COUNT):
    while not done:
        action, _states = model.predict(obs)
        obs, rewards, done, info = pendulum_env.step(action)
        sum_rewards += rewards

    # Keep the fresh observation so the next episode does not start from a stale one
    obs = pendulum_env.reset()
    done = False

print(sum_rewards / TEST_COUNT)
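
A note on the wrapper order used above: putting another wrapper around Monitor should not by itself disable it, because the outer wrapper's step() still calls Monitor's step(). As far as I can tell, Monitor only writes an episode entry when the wrapped environment returns done=True, and a PendulumEnv instantiated directly (rather than through gym.make) has no time limit, so no episode ever ends. The sketch below illustrates this with plain Python stand-ins. ToyEnv, ToyMonitor, ToyObsWrapper and run_steps are hypothetical names, not part of gym or stable_baselines.

```python
class ToyEnv:
    """Hypothetical stand-in for a gym environment; an episode ends after max_steps."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps
        return float(self.t), -1.0, done, {}


class ToyMonitor:
    """Hypothetical Monitor-like wrapper: records a return only when done is True."""

    def __init__(self, env):
        self.env = env
        self.episode_returns = []
        self._running = 0.0

    def reset(self):
        self._running = 0.0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._running += reward
        if done:
            self.episode_returns.append(self._running)
            info = dict(info, episode={"r": self._running})
        return obs, reward, done, info


class ToyObsWrapper:
    """Hypothetical outer wrapper that replaces only the observation."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        self.env.reset()
        return "image"

    def step(self, action):
        _, reward, done, info = self.env.step(action)
        return "image", reward, done, info


def run_steps(env, n_steps):
    """Step the env n_steps times, resetting whenever an episode ends."""
    env.reset()
    for _ in range(n_steps):
        _, _, done, _ = env.step(0)
        if done:
            env.reset()


# Episodes end after 3 steps, so the inner monitor records them
# even though it sits below the observation wrapper.
monitor = ToyMonitor(ToyEnv(max_steps=3))
run_steps(ToyObsWrapper(monitor), n_steps=7)
print(monitor.episode_returns)  # [-3.0, -3.0]

# Episodes effectively never end, so nothing is recorded.
endless = ToyMonitor(ToyEnv(max_steps=10**9))
run_steps(ToyObsWrapper(endless), n_steps=7)
print(endless.episode_returns)  # []
```

If that is what happens here, the same reasoning would also explain why the `while not done` evaluation loop never finishes. Creating the environment through gym.make with a registered id, or adding a time limit wrapper below Monitor, may be worth trying.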

C:\Users\meric\Anaconda3\envs\pioneer\python.exe C:/Users/meric/OneDrive/Masaüstü/TUM/Thesis/Pycharm/pioneer/pendulum_image_A2C.py
2020-07-17 19:13:53.638249: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From C:/Users/meric/OneDrive/Masaüstü/TUM/Thesis/Pycharm/pioneer/pendulum_image_A2C.py:58: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From C:/Users/meric/OneDrive/Masaüstü/TUM/Thesis/Pycharm/pioneer/pendulum_image_A2C.py:60: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2020-07-17 19:13:57.789006: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-17 19:13:57.793476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-17 19:13:57.827360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
2020-07-17 19:13:57.827697: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-07-17 19:13:57.831805: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-07-17 19:13:57.835594: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-07-17 19:13:57.837423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-07-17 19:13:57.842333: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-07-17 19:13:57.845671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-07-17 19:13:57.854817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-17 19:13:57.855150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-17 19:13:58.672437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-17 19:13:58.672654: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-07-17 19:13:58.672763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-07-17 19:13:58.673026: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3001 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\env_checker.py:25: UserWarning: It seems that your observation is an image but the `dtype` of your observation_space is not `np.uint8`. If your observation is not an image, we recommend you to flatten the observation to have only a 1D vector
  warnings.warn("It seems that your observation is an image but the `dtype` "
C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\env_checker.py:210: UserWarning: We recommend you to use a symmetric and normalized Box action space (range=[-1, 1]) cf https://stable-baselines.readthedocs.io/en/master/guide/rl_tips.html
  warnings.warn("We recommend you to use a symmetric and normalized Box action space (range=[-1, 1]) "
Wrapping the env in a DummyVecEnv.
2020-07-17 19:14:00.046052: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
2020-07-17 19:14:00.046321: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-07-17 19:14:00.046506: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-07-17 19:14:00.046684: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-07-17 19:14:00.046906: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-07-17 19:14:00.047086: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-07-17 19:14:00.047271: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-07-17 19:14:00.047449: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-17 19:14:00.047689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-17 19:14:00.047878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-17 19:14:00.048061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-07-17 19:14:00.048175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-07-17 19:14:00.048348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3001 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\policies.py:116: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\input.py:25: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\tf_layers.py:103: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\distributions.py:418: The name tf.random_normal is deprecated. Please use tf.random.normal instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:160: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\tf_util.py:449: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\tf_util.py:449: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\tensorflow_core\python\ops\clip_ops.py:301: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:184: The name tf.train.RMSPropOptimizer is deprecated. Please use tf.compat.v1.train.RMSPropOptimizer instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\tensorflow_core\python\training\rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:194: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:196: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\base_class.py:1169: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

2020-07-17 19:14:01.077903: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-07-17 19:14:01.342446: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-17 19:14:02.338288: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
---------------------------------
| explained_variance | 3.26e-05 |
| fps                | 2        |
| nupdates           | 1        |
| policy_entropy     | 1.42     |
| total_timesteps    | 5        |
| value_loss         | 439      |
---------------------------------
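
Regarding the env_checker warning about the observation dtype in the output above: skimage's resize returns a float image scaled to [0, 1] by default, while the observation space in the wrapper is declared with low=0 and high=255. One possible fix is to convert the resized frame to uint8 before returning it and to declare the Box with dtype np.uint8. A minimal sketch, assuming numpy is available; to_uint8_image is a hypothetical helper name and the small array stands in for a resized render.

```python
import numpy as np


def to_uint8_image(float_img):
    """Convert a float image with values in [0, 1] to uint8 values in [0, 255]."""
    # Clip first so values slightly outside [0, 1] cannot wrap around in uint8.
    scaled = np.clip(float_img, 0.0, 1.0) * 255.0
    return np.rint(scaled).astype(np.uint8)


frame = np.array([[0.0, 0.5], [1.0, 1.2]])  # stand-in for a resized render
obs = to_uint8_image(frame)
print(obs.dtype, obs.min(), obs.max())
```

The same conversion would need to be applied in both reset() and step() so that every observation matches the declared space.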

System Info Describe the characteristic of your environment: