[Question] Agent always gets stuck at the action space lower bound
Important Note: We do not do technical support, nor consulting and don't answer personal questions per email. Please post your question on the RL Discord, Reddit or Stack Overflow in that case.
Question
Hi everyone,
When I use the action space configuration below and sample from it directly, I get vectors with varied values. But when I created a custom gym environment with the same action space configuration and trained it with Stable-Baselines3 (A2C or PPO), every sampled action seems stuck at the action space lower bound (5.0). I was expecting something like:
```
[21.08086  20.020802 16.812733 23.77745  10.687413 20.424904 15.4278145 26.068079 18.092493 22.096527]
[ 5.002933  8.210208 15.631343  5.3958955 29.201706 27.193197 21.82524  25.94392  33.925514 30.831163]
```
What am I doing wrong?
```python
import numpy as np
from gym import spaces

low_actions = []
high_actions = []
for _ in range(10):
    low_actions.append(5.0)
    high_actions.append(35.0)

action_space = spaces.Box(low=np.array(low_actions), high=np.array(high_actions))

for _ in range(1000):
    print(action_space.sample())
```
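For reference, the Stable-Baselines3 documentation recommends exposing a symmetric, normalized action space (e.g. [-1, 1]) to the agent and rescaling to the real range inside the environment. With a Box of [5, 35], the initial Gaussian policy is roughly centered at 0, so clipped samples most likely pile up at the lower bound 5.0; rescaling avoids this. Below is a minimal sketch of that pattern, assuming a toy custom environment: `RescaledActionEnv`, its observation space and placeholder dynamics are invented for illustration, and only the [5, 35] range mirrors the Box space above.

```python
import numpy as np
import gym
from gym import spaces


class RescaledActionEnv(gym.Env):
    """Sketch: the agent sees a symmetric [-1, 1] action space;
    actions are rescaled to the real [5, 35] range inside step()."""

    def __init__(self):
        super().__init__()
        self.real_low, self.real_high = 5.0, 35.0
        # Normalized action space exposed to the agent.
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(10,), dtype=np.float32)
        # Placeholder observation space for the sketch.
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)

    def _rescale(self, action):
        # Map [-1, 1] -> [real_low, real_high].
        return self.real_low + 0.5 * (np.asarray(action) + 1.0) * (self.real_high - self.real_low)

    def step(self, action):
        real_action = self._rescale(action)    # use real_action in the actual dynamics
        obs = self.observation_space.sample()  # placeholder dynamics
        reward = 0.0                           # placeholder reward
        done = False
        return obs, reward, done, {"real_action": real_action}

    def reset(self):
        return self.observation_space.sample()
```

With this layout the agent's clipped actions span the whole real range instead of saturating at 5.0.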
Checklist
- I have read the documentation (required)
- I have checked that there is no similar issue in the repo (required)
Top GitHub Comments
@araffin Thank you very much once more. You're a life saver. However, may I ask a complementary question? Considering that tip, should I also normalize the observation space even when the states are discrete (using stable-baselines3 DQN)? If yes, to something like [0, 1] or [-1, 1]?
I think you may be confusing the action and observation spaces. But yes, for observation spaces, as mentioned in the doc, it is always good practice to normalize them (whether to [-1, 1] or [0, 1] should not really matter).
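For illustration, here is a minimal sketch of observation normalization using Stable-Baselines3's VecNormalize wrapper; CartPole-v1 is just a stand-in for your own environment, and the file name is arbitrary.

```python
import gym
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# CartPole-v1 stands in for your own environment here.
env = DummyVecEnv([lambda: gym.make("CartPole-v1")])
# Keep a running mean/std of observations; leave rewards untouched.
env = VecNormalize(env, norm_obs=True, norm_reward=False)

model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

# Save the normalization statistics so they can be reloaded for evaluation.
env.save("vec_normalize.pkl")
```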
If your issue is solved, yes.