
Why does env.render() create multiple render screens? | LSTM policy predict with one env [question]

See original GitHub issue

When I run the code example from the docs for CartPole multiprocessing, it renders one window with all the envs playing the game, and it also opens individual windows with the same envs playing the same games.

import gym
import numpy as np

from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.vec_env import SubprocVecEnv
from stable_baselines.common import set_global_seeds
from stable_baselines import ACKTR

def make_env(env_id, rank, seed=0):
    """
    Utility function for multiprocessed env.

    :param env_id: (str) the environment ID
    :param rank: (int) index of the subprocess
    :param seed: (int) the initial seed for RNG
    """
    def _init():
        env = gym.make(env_id)
        env.seed(seed + rank)
        return env
    set_global_seeds(seed)
    return _init

env_id = "CartPole-v1"
num_cpu = 4  # Number of processes to use
# Create the vectorized environment
env = SubprocVecEnv([make_env(env_id, i) for i in range(num_cpu)])

model = ACKTR(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=25000)

obs = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()
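
A minimal sketch of one way to avoid the extra windows, assuming the goal is a single render view: keep the SubprocVecEnv for training, then evaluate on a separate non-vectorized env (reusing env_id and the trained model from the snippet above):

eval_env = gym.make(env_id)  # a single, non-vectorized environment -> one window
obs = eval_env.reset()
for _ in range(1000):
    # MlpPolicy predictions also work on a single observation
    action, _states = model.predict(obs)
    obs, reward, done, info = eval_env.step(action)
    eval_env.render()
    if done:
        obs = eval_env.reset()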

System Info
Describe the characteristics of your environment:

  • Vanilla install, followed the docs using pip
  • GPUs: 2x GTX 1080 Ti
  • Python version 3.6.5
  • Tensorflow version 1.12.0
  • ffmpeg 4.0

Additional context: CartPole

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 24 (6 by maintainers)

Top GitHub Comments

1 reaction
hn2 commented, Jun 18, 2019

n_env = 12
env = PortfolioEnv(history=history, abbreviation=instruments, steps=settings['steps'],
                   window_length=settings['window_length'], include_ta=settings['include_ta'],
                   allow_short=settings['allow_short'], reward=settings['reward'],
                   debug=settings['debug'])
env = SubprocVecEnv([lambda: env for _ in range(1)])

mdl = 'ES_19900102_20180101_5000000_7000_1_return_False_7a686c53e4a34338942a8b4bbe65fa47'
model = PPO2.load(mdl)

# initialized here
obs = env.reset()
zero_completed_obs = np.zeros((n_env,) + env.observation_space.shape)
zero_completed_obs[0, :] = obs

state = None
state = model.initial_state   
done = np.zeros(state.shape[0])   

pos, state = model.predict(zero_completed_obs, state, done)

Traceback (most recent call last):
  File "C:\Users\hanna\Anaconda3\lib\site-packages\quantiacsToolbox\quantiacsToolbox.py", line 871, in runts
    position, settings = TSobject.myTradingSystem(*argList)
  File "ppo2_quantiacs_test.py", line 68, in myTradingSystem
    pos, state = model.predict(zero_completed_obs, state, done)
  File "C:\Users\hanna\Anaconda3\lib\site-packages\stable_baselines\common\base_class.py", line 469, in predict
    vectorized_env = self._is_vectorized_observation(observation, self.observation_space)
  File "C:\Users\hanna\Anaconda3\lib\site-packages\stable_baselines\common\base_class.py", line 399, in _is_vectorized_observation
    .format(", ".join(map(str, observation_space.shape))))
ValueError: Error: Unexpected observation shape (12, 5) for Box environment, please use (10,) or (n_env, 10) for the observation shape.

1 reaction
hill-a commented, Jun 18, 2019

Still have a problem in

pos, state = model.predict(zero_completed_obs, state, done)

ValueError: Error: Unexpected observation shape (12, 5) for Box environment, please use (10,) or (n_env, 10) for the observation shape.

Model was trained with n_env = 12

Where does this 10 come from?

A few things:

Your issue will not be addressed if you do not follow the format described in the issue template (https://github.com/hill-a/stable-baselines/blob/master/.github/ISSUE_TEMPLATE/issue-template.md)
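
For context, the error above comes from predict() validating the observation against the observation space that was saved with the model, so the 10 is presumably the observation dimension recorded at training time, while the freshly built PortfolioEnv produces observations of length 5. A quick check (a sketch, assuming the env and mdl from the comment above, and that PortfolioEnv exposes a standard gym observation_space):

from stable_baselines import PPO2

model = PPO2.load(mdl)
print(model.observation_space.shape)  # space saved with the model, (10,) in the error above
print(env.observation_space.shape)    # space of the newly constructed env, (5,) in the error above
# If these two shapes differ, predict() rejects the observation regardless of how
# zero_completed_obs is padded; the evaluation env must match the training-time space.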

Read more comments on GitHub >

Top Results From Across the Web

Create custom gym environments from scratch — A stock ...
OpenAI's gym is an awesome package that allows you to create custom reinforcement learning agents. ... Render the environment to the screen
Read more >
Whenever I try to use env.render() for OpenAIgym I get ...
I have tried it but the problem is when I use env.reset(). It creates a pop up windows. and my kernel gets stuck...
Read more >
SAC — Stable Baselines3 1.7.0a5 documentation
In our implementation, we use an entropy coefficient (as in OpenAI Spinning or Facebook Horizon), which is the equivalent to the inverse of...
Read more >
Reinforcement Learning (DQN) Tutorial
This tutorial shows how to use PyTorch to train a Deep Q Learning (DQN) agent on the CartPole-v0 task from the OpenAI Gym....
Read more >
Rendering Environment Outputs & Q-Learning
Setting the Google Colab rendering mechanism for OpenAI Gym. In this article you'll learn how to render OpenAI environment outputs with env.step() ......
Read more >
