[question] Using the Monitor wrapper with an already wrapped custom environment

I am trying to wrap my custom environment with the Monitor wrapper to get additional information about the episode rewards. But since I also wrap the environment afterwards with my own custom wrapper, the initial Monitor wrapping seems to become obsolete. Is there a way to use the Monitor wrapper on custom environments? I have already seen issue #470, but the answers there did not help.

import os
import time

from gym import Wrapper, spaces
import numpy as np
from gym.envs.classic_control import PendulumEnv

from stable_baselines.common.env_checker import check_env
from stable_baselines.sac.policies import CnnPolicy
from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines.bench import Monitor

from skimage import data, color
from skimage.transform import rescale, resize, downscale_local_mean

import tensorflow as tf


class RGBArrayAsObservationWrapper(Wrapper):
    """
    Use env.render('rgb_array') as observation
    rather than the observation environment provides
    """

    def __init__(self, env):
        # TODO this might not work before environment has been reset
        super(RGBArrayAsObservationWrapper, self).__init__(env)
        self.reset()
        dummy_obs = env.render('rgb_array')
        dummy_obs_resized = resize(dummy_obs, (dummy_obs.shape[0] // 10, dummy_obs.shape[1] // 10),
                                   anti_aliasing=True)
        # Update observation space
        # TODO assign correct low and high
        self.observation_space = spaces.Box(low=0, high=255, shape=dummy_obs_resized.shape,
                                            dtype=dummy_obs_resized.dtype)

    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        obs = self.env.render("rgb_array")
        obs = resize(obs, (obs.shape[0] // 10, obs.shape[1] // 10),
                     anti_aliasing=True)
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        obs = self.env.render("rgb_array")
        obs = resize(obs, (obs.shape[0] // 10, obs.shape[1] // 10),
                     anti_aliasing=True)
        return obs, reward, done, info


# tensorboard --logdir=A2C_IMG_PENDULUM:C:\Users\meric\OneDrive\Masaüstü\TUM\Thesis\Pycharm\pioneer\a2c_pendulum_tensorboard --host localhost

log_dir = "/tmp/gym/{}".format(int(time.time()))
os.makedirs(log_dir, exist_ok=True)

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

TEST_COUNT = 100

pendulum_env = PendulumEnv()
pendulum_env = Monitor(pendulum_env, log_dir, allow_early_resets=True)
pendulum_env = RGBArrayAsObservationWrapper(pendulum_env)
check_env(pendulum_env, warn=True)

model = A2C("CnnPolicy", pendulum_env, verbose=1, tensorboard_log="./a2c_pendulum_tensorboard/")
model.learn(total_timesteps=100_000, log_interval=10)
model.save("a2c_pendulum")

sum_rewards = 0
done = False
obs = pendulum_env.reset()
for i in range(TEST_COUNT):
    while not done:
        action, _states = model.predict(obs)
        obs, rewards, done, info = pendulum_env.step(action)
        sum_rewards += rewards

    obs = pendulum_env.reset()  # keep the fresh observation for the next episode
    done = False

print(sum_rewards / TEST_COUNT)


C:\Users\meric\Anaconda3\envs\pioneer\python.exe C:/Users/meric/OneDrive/Masaüstü/TUM/Thesis/Pycharm/pioneer/pendulum_image_A2C.py
2020-07-17 19:13:53.638249: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From C:/Users/meric/OneDrive/Masaüstü/TUM/Thesis/Pycharm/pioneer/pendulum_image_A2C.py:58: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From C:/Users/meric/OneDrive/Masaüstü/TUM/Thesis/Pycharm/pioneer/pendulum_image_A2C.py:60: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2020-07-17 19:13:57.789006: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-17 19:13:57.793476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-17 19:13:57.827360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
2020-07-17 19:13:57.827697: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-07-17 19:13:57.831805: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-07-17 19:13:57.835594: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-07-17 19:13:57.837423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-07-17 19:13:57.842333: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-07-17 19:13:57.845671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-07-17 19:13:57.854817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-17 19:13:57.855150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-17 19:13:58.672437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-17 19:13:58.672654: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-07-17 19:13:58.672763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-07-17 19:13:58.673026: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3001 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\env_checker.py:25: UserWarning: It seems that your observation is an image but the `dtype` of your observation_space is not `np.uint8`. If your observation is not an image, we recommend you to flatten the observation to have only a 1D vector
  warnings.warn("It seems that your observation is an image but the `dtype` "
C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\env_checker.py:210: UserWarning: We recommend you to use a symmetric and normalized Box action space (range=[-1, 1]) cf https://stable-baselines.readthedocs.io/en/master/guide/rl_tips.html
  warnings.warn("We recommend you to use a symmetric and normalized Box action space (range=[-1, 1]) "
Wrapping the env in a DummyVecEnv.
2020-07-17 19:14:00.046052: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
2020-07-17 19:14:00.046321: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-07-17 19:14:00.046506: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-07-17 19:14:00.046684: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-07-17 19:14:00.046906: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-07-17 19:14:00.047086: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-07-17 19:14:00.047271: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-07-17 19:14:00.047449: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-17 19:14:00.047689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-17 19:14:00.047878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-17 19:14:00.048061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-07-17 19:14:00.048175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-07-17 19:14:00.048348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3001 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\policies.py:116: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\input.py:25: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\tf_layers.py:103: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\distributions.py:418: The name tf.random_normal is deprecated. Please use tf.random.normal instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:160: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\tf_util.py:449: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\tf_util.py:449: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\tensorflow_core\python\ops\clip_ops.py:301: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:184: The name tf.train.RMSPropOptimizer is deprecated. Please use tf.compat.v1.train.RMSPropOptimizer instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\tensorflow_core\python\training\rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:194: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\a2c\a2c.py:196: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

WARNING:tensorflow:From C:\Users\meric\Anaconda3\envs\pioneer\lib\site-packages\stable_baselines\common\base_class.py:1169: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

2020-07-17 19:14:01.077903: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-07-17 19:14:01.342446: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-17 19:14:02.338288: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
---------------------------------
| explained_variance | 3.26e-05 |
| fps                | 2        |
| nupdates           | 1        |
| policy_entropy     | 1.42     |
| total_timesteps    | 5        |
| value_loss         | 439      |
---------------------------------

System Info

  • Stable Baselines installation method (pip, docker, source, …): source
  • GPU models and configuration: NVIDIA GTX 1050 with CUDA 10.0 and cuDNN 7.6.5
  • Python version: 3.7
  • TensorFlow version: 1.15

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 10

Top GitHub Comments

3 reactions
araffin commented, Jul 19, 2020

The Pendulum env has no time limit by default; you need to wrap it with a TimeLimit wrapper or use gym.make('Pendulum-v0') to have episodes.
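To illustrate the point above, here is a minimal sketch built only from toy stand-ins. Every class name in it is hypothetical and none of them is the real gym or Stable Baselines class. It assumes that a Monitor-style wrapper records an episode only when the inner env returns done=True, and that an outer custom wrapper forwards step() to whatever it wraps. Under those assumptions the inner Monitor is not made obsolete by the outer wrapper; it simply never sees an episode end unless a time limit sits inside it.

```python
class NeverDoneEnv:
    """Toy stand-in for a raw env such as PendulumEnv(): it never sets done."""

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return float(self.t), -1.0, False, {}


class Wrapper:
    """Minimal stand-in for gym.Wrapper: forwards reset/step to the inner env."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)


class StepLimit(Wrapper):
    """Stand-in for a time-limit wrapper: forces done after max_steps."""

    def __init__(self, env, max_steps):
        super().__init__(env)
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.steps += 1
        if self.steps >= self.max_steps:
            done = True
        return obs, reward, done, info


class EpisodeRecorder(Wrapper):
    """Stand-in for Monitor: stores a return only when an episode ends."""

    def __init__(self, env):
        super().__init__(env)
        self.episode_returns = []
        self.current = 0.0

    def reset(self):
        self.current = 0.0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.current += reward
        if done:
            self.episode_returns.append(self.current)
        return obs, reward, done, info


class ObservationOverride(Wrapper):
    """Stand-in for the custom wrapper: replaces obs, passes the rest through."""

    def reset(self):
        self.env.reset()
        return "image"

    def step(self, action):
        _, reward, done, info = self.env.step(action)
        return "image", reward, done, info


def run(env, n_steps):
    """Step the env with a dummy action, resetting whenever an episode ends."""
    env.reset()
    for _ in range(n_steps):
        _, _, done, _ = env.step(0)
        if done:
            env.reset()


# Recorder directly around the never-ending env: nothing is ever logged.
no_limit = EpisodeRecorder(NeverDoneEnv())
run(ObservationOverride(no_limit), 1000)

# Step limit inside the recorder: one entry per 200-step episode.
with_limit = EpisodeRecorder(StepLimit(NeverDoneEnv(), max_steps=200))
run(ObservationOverride(with_limit), 1000)

print(len(no_limit.episode_returns))    # 0
print(len(with_limit.episode_returns))  # 5
```

In the script from the question, the analogous change would presumably be to build the inner env as gym.wrappers.TimeLimit(PendulumEnv(), max_episode_steps=200) before passing it to Monitor, and to keep RGBArrayAsObservationWrapper as the outermost wrapper. The value 200 is an assumption chosen to match the limit Pendulum-v0 is registered with.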

0 reactions
araffin commented, Jul 19, 2020

If you want to solve Pendulum-v0, it is better to use an off-policy algorithm (SAC/TD3) and the hyperparameters from the zoo.

Closing this as the original question was answered.
