[question] Understanding observation_space

See original GitHub issue

Firstly, thanks for such an amazing resource - it’s been absolutely invaluable.

(How do I add a [question] and [custom gym env] label to this???)

I’ve been really struggling with what probably amounts to a rudimentary problem and was wondering if anyone has the patience or inclination to help explain the correct use of observation_space in a custom environment.

At each step I have a numpy array of dimensions (119,7) which I’d like to use as the observation. I’m currently trying to use the ACKTR model with MlpLnLstmPolicy.

With that in mind I’ve tried a number of ways to set a Box space, but have yet to manage to crack it.

min_vals is an array taken from my ‘master’ dataframe, i.e., min_vals = master.min(axis=0).values
max_vals is an array built the same way, i.e., max_vals = master.max(axis=0).values
observation is an array of shape (119,7), i.e., observation = master.space

When I try to use:
self.observation_space = spaces.Box(low=min_vals, high=max_vals)
I get “ValueError: could not broadcast input array from shape (119,7) into shape (7)”

Using:
self.observation_space = spaces.Box(low=min(min_vals), high=max(max_vals), shape=(119,7), dtype=np.float32)
I get “TypeError: Object of type ‘int64’ is not JSON serializable”

With:
self.observation_space = spaces.Box(low=min_vals, high=max_vals, shape=(119,7), dtype=np.float32)
I get an AssertionError based on ‘assert np.isscalar(low) and np.isscalar(high)’

I could go on but at this point my efforts devolved into just trying random sequences. I’ve tried to compare what I’m doing to a number of other envs and I’m clearly failing to grasp the observation_space concept so any help would be really appreciated.
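
For readers following along, the shape mismatch behind the first error can be reproduced with NumPy alone. This is only a sketch: master_values is a hypothetical random stand-in for the values of the ‘master’ dataframe, and taking the first 119 rows as the observation window is an assumption made for illustration.

```python
import numpy as np

# Hypothetical stand-in for the 'master' dataframe values: many rows, 7 columns.
rng = np.random.default_rng(0)
master_values = rng.normal(size=(1000, 7)).astype(np.float32)

# Column-wise extremes have one entry per feature, i.e. shape (7,).
min_vals = master_values.min(axis=0)
max_vals = master_values.max(axis=0)

# One observation is assumed to be a window of 119 rows, i.e. shape (119, 7).
observation = master_values[:119]

# Repeat the per-column bounds for every row so they match the observation shape.
lows = np.tile(min_vals, (119, 1))
highs = np.tile(max_vals, (119, 1))

print(min_vals.shape, observation.shape, lows.shape)
```

Bounds shaped like lows and highs match the observation element for element, which is the form spaces.Box expects when low and high are passed as arrays without a shape argument.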

Issue Analytics

Created 4 years ago
Comments: 5 (1 by maintainers)

Top GitHub Comments

flipflop4 commented, May 2, 2019

@araffin I’ve just now managed to check this out, apologies for the delay.

Thanks so much for the help. If there’s anything I can do to contribute, let me know. You guys are rocking it, keep up the good work.

araffin commented, Apr 27, 2019

@hill-a I think you confused observation space with action space for ACKTR.

@flipflop4, the following code works, so you can maybe take inspiration from it 😉:

import numpy as np
import gym
from gym import spaces
from stable_baselines import ACKTR
from stable_baselines.common.vec_env import DummyVecEnv


class CustomEnv(gym.Env):
    metadata = {'render.modes': ['human']}

    def __init__(self):
        super(CustomEnv, self).__init__()
        self.action_space = spaces.Discrete(4)
        self.observation_space = spaces.Box(-1, 1, shape=(119, 7), dtype=np.float32)
        # Alternatively, you can define your space like that:
        # highs = np.ones((119, 7), dtype=np.float32)
        # lows = -highs
        # self.observation_space = spaces.Box(lows, highs, dtype=np.float32)

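    def step(self, action):
        # Placeholder dynamics: random observation, zero reward, episode never ends
        obs = self.observation_space.sample()
        return obs, 0, False, {}

    def reset(self):
        # Return a random initial observation with the declared shape (119, 7)
        return self.observation_space.sample()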


env = DummyVecEnv([lambda: CustomEnv()])
model = ACKTR('MlpLstmPolicy', env, verbose=1).learn(10000)
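
If you keep the Box(-1, 1, shape=(119, 7)) definition from the snippet above, the real observations have to land inside that range. One possible way to get there is to rescale each column using the dataset-wide minimum and maximum from the question. The sketch below is an assumption about how that could look, not something taken from the thread: scale_to_unit_range is a hypothetical helper, and master_values is random stand-in data.

```python
import numpy as np


def scale_to_unit_range(obs, min_vals, max_vals):
    """Linearly map each column of obs from [min, max] to [-1, 1]."""
    span = max_vals - min_vals
    # Guard against constant columns, which would otherwise divide by zero.
    span = np.where(span == 0, 1.0, span)
    scaled = 2.0 * (obs - min_vals) / span - 1.0
    return scaled.astype(np.float32)


# Hypothetical data standing in for the 'master' dataframe values.
rng = np.random.default_rng(0)
master_values = rng.uniform(-50.0, 200.0, size=(1000, 7))
min_vals = master_values.min(axis=0)
max_vals = master_values.max(axis=0)

# Scale one assumed window of 119 rows; the (7,) bounds broadcast across rows.
obs = scale_to_unit_range(master_values[:119], min_vals, max_vals)
print(obs.shape, obs.dtype)
```

With this approach, step and reset would return the scaled window instead of observation_space.sample(). Values outside the dataset-wide range would fall outside [-1, 1], so this only holds if the bounds really cover all the data the environment will see.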