[Question] Agent always gets stuck at the action space lower bound
Important Note: We do not do technical support, nor consulting and don't answer personal questions per email. Please post your question on the RL Discord, Reddit or Stack Overflow in that case.
Question
Hi everyone,
When I use the action space configuration below and sample from it directly, I get vectors with varied values. But when I created a custom gym environment with the same action space configuration and trained it with Stable-Baselines3 (A2C or PPO), every sampled action seems stuck at the action space lower bound (5.0). I was expecting something like:
```
[21.08086  20.020802 16.812733 23.77745  10.687413 20.424904 15.4278145 26.068079 18.092493 22.096527]
[ 5.002933  8.210208 15.631343  5.3958955 29.201706 27.193197 21.82524  25.94392  33.925514 30.831163]
```
What am I doing wrong?
```python
import numpy as np
from gym import spaces

low_actions = []
high_actions = []
for _ in range(10):
    low_actions.append(5.0)
    high_actions.append(35.0)

action_space = spaces.Box(low=np.array(low_actions), high=np.array(high_actions))

for _ in range(1000):
    print(action_space.sample())
```
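For reference, the Stable-Baselines3 documentation recommends exposing a symmetric, normalized action space (e.g. [-1, 1]) to the agent and rescaling to the real range inside the environment. With a Box of [5, 35], the initial Gaussian policy is roughly centered at 0, so clipped samples most likely pile up at the lower bound 5.0; rescaling avoids this. Below is a minimal sketch of that pattern, assuming a toy custom environment: `RescaledActionEnv`, its observation space and placeholder dynamics are invented for illustration, and only the [5, 35] range mirrors the Box space above.

```python
import numpy as np
import gym
from gym import spaces


class RescaledActionEnv(gym.Env):
    """Sketch: the agent sees a symmetric [-1, 1] action space;
    actions are rescaled to the real [5, 35] range inside step()."""

    def __init__(self):
        super().__init__()
        self.real_low, self.real_high = 5.0, 35.0
        # Normalized action space exposed to the agent.
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(10,), dtype=np.float32)
        # Placeholder observation space for the sketch.
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)

    def _rescale(self, action):
        # Map [-1, 1] -> [real_low, real_high].
        return self.real_low + 0.5 * (np.asarray(action) + 1.0) * (self.real_high - self.real_low)

    def step(self, action):
        real_action = self._rescale(action)    # use real_action in the actual dynamics
        obs = self.observation_space.sample()  # placeholder dynamics
        reward = 0.0                           # placeholder reward
        done = False
        return obs, reward, done, {"real_action": real_action}

    def reset(self):
        return self.observation_space.sample()
```

With this layout the agent's clipped actions span the whole real range instead of saturating at 5.0.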
Checklist
- I have read the documentation (required)
- I have checked that there is no similar issue in the repo (required)
Top GitHub Comments
@araffin Thank you very much once more. You're a life saver. However, may I ask a complementary question? Considering that tip, should I also normalize the observation space even when the states are discrete (using stable-baselines3 DQN)? If yes, to something like [0, 1] or [-1, 1]?
I think you may be confusing the action and observation spaces. But yes, for observation spaces, as mentioned in the doc, it is always good practice to normalize them (whether to [-1, 1] or [0, 1] should not really matter).
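For illustration, here is a minimal sketch of observation normalization using Stable-Baselines3's VecNormalize wrapper; CartPole-v1 is just a stand-in for your own environment, and the file name is arbitrary.

```python
import gym
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# CartPole-v1 stands in for your own environment here.
env = DummyVecEnv([lambda: gym.make("CartPole-v1")])
# Keep a running mean/std of observations; leave rewards untouched.
env = VecNormalize(env, norm_obs=True, norm_reward=False)

model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

# Save the normalization statistics so they can be reloaded for evaluation.
env.save("vec_normalize.pkl")
```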
If your issue is solved, yes.