Could not render SAC pendulum example given in Documentation

I have tried rendering the Pendulum-v0 animation with different code, but the SAC example from the stable_baselines docs only shows the first frame of the pendulum render window and does not animate after that.

Here is the code in which I can see the animation:

import gym
env = gym.make('Pendulum-v0')
env.reset()

for i in range(1000):
    env.step(env.action_space.sample())
    env.render()

Here is the SAC code from the docs, with which the animation freezes:

import gym
import numpy as np

from stable_baselines.sac.policies import MlpPolicy
from stable_baselines import SAC

env = gym.make('Pendulum-v0')

model = SAC(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=100000, log_interval=10)
model.save("sac_pendulum")

model = SAC.load("sac_pendulum")

obs = env.reset()
while True:
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()

This code runs, but the render window freezes on the first frame.

System Info
Stable Baselines[mpi], installed from pip
Ubuntu 16.04
Nvidia GeForce 940m
Python version: 3.7.6
Tensorflow version: 1.15.0 (GPU version)
Using IPython in terminal

Top GitHub Comments

araffin commented, Mar 29, 2020

What I meant, this should work:


import gym
import numpy as np

from stable_baselines import SAC

env = gym.make('Pendulum-v0')

model = SAC('MlpPolicy', env, verbose=1)

# Render before training for 500 steps
obs = env.reset()
for _ in range(500):
    action, _states = model.predict(obs)
    obs, reward, done, info = env.step(action)
    env.render()
    # reset the env at the end of an episode
    if done:
        obs = env.reset()

# Train
model.learn(total_timesteps=20000, log_interval=10)
model.save("sac_pendulum")

model = SAC.load("sac_pendulum")

# Render after training
obs = env.reset()
while True:
    action, _states = model.predict(obs)
    obs, reward, done, info = env.step(action)
    env.render()
    # reset the env at the end of an episode
    if done:
        obs = env.reset()

sprakashdash commented, Mar 29, 2020

Thanks a lot, it worked!
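
The render loop in the accepted answer appears twice, once before and once after training. As a rough sketch only, it could be factored into one helper. The name `render_rollout` and its parameters are hypothetical and are not part of stable_baselines or gym. The helper assumes the classic Gym API used in this thread, where `reset()` returns an observation and `step()` returns `(obs, reward, done, info)`.

```python
def render_rollout(env, predict, n_steps=500, render=True):
    """Step `env` for `n_steps`, resetting whenever an episode ends.

    Hypothetical helper, not a library function. Assumes the classic Gym
    API: reset() -> obs, step(action) -> (obs, reward, done, info).
    `predict` is any callable that maps an observation to an action.
    Returns the number of episodes that finished during the rollout.
    """
    episodes = 0
    obs = env.reset()
    try:
        for _ in range(n_steps):
            action = predict(obs)
            obs, reward, done, info = env.step(action)
            if render:
                env.render()
            # Reset the env at the end of an episode, as in the answer above
            if done:
                episodes += 1
                obs = env.reset()
    finally:
        # Close the render window even if the loop is interrupted
        env.close()
    return episodes
```

With the objects from the answer above, a call might look like `render_rollout(env, lambda obs: model.predict(obs)[0])`, since `model.predict` returns an `(action, state)` pair in that code. Note that the helper closes the environment when it finishes, so a fresh `gym.make('Pendulum-v0')` would be needed before training or rendering again.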
