[Question] Support for infeasible regions of the action domain
Important Note: We do not do technical support, nor consulting and don’t answer personal questions per email. Please post your question on the RL Discord, Reddit or Stack Overflow in that case.
Question
My question concerns environments where not every action is feasible at control time. My action space is normalized to [-1, 1], but at a given timestep only [-0.4, 0.4] may be feasible, or only [-0.1, 0.2] (not necessarily symmetric). When I train my env with stable-baselines3, the actions received by my environment are far outside [-1, 1], and I think this is caused by my feasible domain not being fixed or symmetric.
How should I deal with this issue? This limitation comes from a physical constraint over which I have no control.
Additional context
Example of the feasible action bounds (already normalized to [-1, 1]) at two different timesteps:
Timestep 1:
```python
In [10]: env.action_space.low
Out[10]: array([-1., -0.4101973, -0.5177778, -0.90388674, -1.], dtype=float32)

In [11]: env.action_space.high
Out[11]: array([1., 1., 1., 0.8961133, 0.45055556], dtype=float32)
```
Timestep 2:
```python
In [18]: env.action_space.low
Out[18]: array([-1., -0.35272583, -0.90092593, -0.09008063, -1.], dtype=float32)

In [19]: env.action_space.high
Out[19]: array([0.52294207, 1., 1., 1., 0.37092593], dtype=float32)
```
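For reference, the snapshots above come from the env redefining its declared action_space at each timestep. A hypothetical reconstruction of that pattern (the helper name and the Gymnasium import are assumptions, since the original code was not posted) looks like the sketch below; this is the part the answer further down replaces:

```python
import numpy as np
from gymnasium import spaces

def set_feasible_bounds(env, low, high):
    # Anti-pattern: redefining the declared action space at control time.
    # SB3 captures the action space once at model creation, so bounds
    # changed here after training starts are not seen by the agent.
    env.action_space = spaces.Box(
        low=np.asarray(low, dtype=np.float32),
        high=np.asarray(high, dtype=np.float32),
        dtype=np.float32,
    )
```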
Checklist
- I have read the documentation (required)
- I have checked that there is no similar issue in the repo (required)
Top GitHub Comments
Hmm, if your environment is still Markovian, it should not be an issue. For instance, SAC uses a Q-value function, Q(state, action), so the value depends on both the state and the action. If by that you mean keeping a fixed “exposed” action space with limits [-1, 1] and then changing the limits inside the env, this should work.
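A minimal sketch of that suggestion, assuming the Gymnasium API (swap in gym for older setups). The class name FeasibleActionEnv, the dimensions N_ACT/N_STATE, and the bound-update rule are illustrative, not from the thread: the agent always sees a fixed [-1, 1] Box, the env clips each action into the current feasible range, and the feasible bounds are appended to the observation so the problem stays Markovian.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

N_ACT = 5    # hypothetical action dimension, matching the snippets above
N_STATE = 3  # hypothetical state dimension

class FeasibleActionEnv(gym.Env):
    """Expose a fixed [-1, 1] action space; enforce the time-varying
    feasible limits inside step(). Dynamics and reward are placeholders."""

    def __init__(self):
        # Fixed "exposed" action space: the agent always sees [-1, 1].
        self.action_space = spaces.Box(-1.0, 1.0, shape=(N_ACT,), dtype=np.float32)
        # Observation = raw state + current feasible low/high, so the
        # Q-function can condition on the limits (keeps it Markovian).
        self.observation_space = spaces.Box(
            -np.inf, np.inf, shape=(N_STATE + 2 * N_ACT,), dtype=np.float32
        )
        self._state = np.zeros(N_STATE, dtype=np.float32)
        self._low = -np.ones(N_ACT, dtype=np.float32)
        self._high = np.ones(N_ACT, dtype=np.float32)

    def _update_feasible_bounds(self):
        # Placeholder: recompute the physically feasible range for this step.
        self._low = self.np_random.uniform(-1.0, 0.0, N_ACT).astype(np.float32)
        self._high = self.np_random.uniform(0.0, 1.0, N_ACT).astype(np.float32)

    def _get_obs(self):
        return np.concatenate([self._state, self._low, self._high])

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._state = np.zeros(N_STATE, dtype=np.float32)
        self._update_feasible_bounds()
        return self._get_obs(), {}

    def step(self, action):
        # Clip into the feasible region instead of mutating self.action_space.
        feasible_action = np.clip(action, self._low, self._high)
        self._state = self._state  # placeholder: apply feasible_action to the plant
        reward = 0.0               # placeholder reward
        self._update_feasible_bounds()
        return self._get_obs(), reward, False, False, {}
```

With this pattern the declared action space stays static, so e.g. SAC("MlpPolicy", FeasibleActionEnv()) trains normally while the physical limits vary per step; whether clipping or rescaling [-1, 1] into [low, high] fits better depends on the plant.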
@araffin We conducted some experiments with your suggestion and the results were very good. Thanks for your help!