Suggestion for kinematicObservation
Hello, very nice job with this environment! I’m currently using your work for my master’s thesis on adversarial RL; it’s the perfect environment for my subject, thank you!
I have a question and a suggestion about the KinematicObservation class (with absolute=False):
In the observe() function, I think there is a problem with the first row of the observation, where you add the ego-vehicle’s data.
When I run the highway-v0 env with kinematic observations, once the ego-vehicle exceeds 150 m along the x axis (or 5*max_speed in the general case), the value returned by observe() in the first row, second column is always 1.0 until the end of the episode.
I see that this is because the x values are remapped from [-150,150] to [-1,1] in the normalize() function (y and the velocities are rescaled in the same way, each from its own range), and the resulting values are then clipped to [-1,1].
Tell me if I am wrong, but I think clipping was chosen here with the exo-vehicles in mind, since their data are relative to the ego-vehicle’s position and velocity. The 150 m limit then seems to mean “the ego-vehicle can see exo-vehicles up to 150 m away”. But clipping does not suit the first row of the observation, because once it saturates, that row no longer conveys how far the ego-vehicle has travelled.
I think with absolute=True there is nothing to change, because all the data are comparable (it is just important to choose the features_range for the x axis according to the selected duration).
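As a rough sketch of that last point, the x range for absolute=True could be sized from the episode length. The helper x_range_for_episode is hypothetical, max_speed = 30 m/s is inferred from "5*max_speed = 150", the 40 s duration is an assumption, and the configuration keys are written as I understand the Kinematics observation config, so please check them against your version:

```python
def x_range_for_episode(max_speed, duration_s, start_x=0.0):
    """Hypothetical helper: bound on absolute x, assuming a straight road
    driven at max_speed for the whole episode."""
    return [start_x, start_x + max_speed * duration_s]


# Assumed values: max_speed = 30 m/s, duration = 40 s.
# The y, vx and vy ranges are the ones implied by the numbers in this issue.
config = {
    "observation": {
        "type": "Kinematics",
        "absolute": True,
        "features": ["presence", "x", "y", "vx", "vy"],
        "features_range": {
            "x": x_range_for_episode(max_speed=30.0, duration_s=40.0),
            "y": [-16.0, 16.0],
            "vx": [-60.0, 60.0],
            "vy": [-60.0, 60.0],
        },
    }
}

print(config["observation"]["features_range"]["x"])
```

Vehicles spawned ahead of the ego-vehicle can sit beyond this bound, so some margin on the upper limit is probably wise.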
But with absolute=False the data in the first row (ego-vehicle) are on a completely different scale from the data in the other rows (exo-vehicles). And since the exo-vehicles’ data are relative to the ego-vehicle, the agent no longer needs information about itself; the relative information about the exo-vehicles seems to be sufficient.
Just to illustrate (with absolute=False):
This is the observation before remapping:
   presence           x     y         vx   vy
0         1  251.576068   0.0  25.000000  0.0
1         1   -9.811933   8.0  -1.846148  0.0
2         1   18.651901   4.0  -0.108333  0.0
3         1   52.169854  12.0  -0.569056  0.0
4         1   75.469690   0.0  -1.198071  0.0
This is the observation before clipping:
   presence         x     y        vx   vy
0         1  1.677174  0.00  0.416667  0.0
1         1 -0.065413  0.50 -0.030769  0.0
2         1  0.124346  0.25 -0.001806  0.0
3         1  0.347799  0.75 -0.009484  0.0
4         1  0.503131  0.00 -0.019968  0.0
This is the observation returned by observe():
[[ 1.          1.          0.          0.41666667  0.        ]
 [ 1.         -0.06541288  0.5        -0.03076914  0.        ]
 [ 1.          0.12434601  0.25       -0.00180556  0.        ]
 [ 1.          0.34779903  0.75       -0.00948426  0.        ]
 [ 1.          0.50313127  0.         -0.01996785  0.        ]]
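The three stages above can be reproduced with a few lines of NumPy. This is only a sketch of the remap-then-clip behaviour as I understand it, not the library's code: the remap helper is my own, and the feature ranges (x in [-150, 150], y in [-16, 16], vx and vy in [-60, 60]) are inferred from the numbers in this issue.

```python
import numpy as np

COLUMNS = ["presence", "x", "y", "vx", "vy"]
# Ranges inferred from the values printed in this issue (assumptions).
FEATURE_RANGES = {
    "x": (-150.0, 150.0),
    "y": (-16.0, 16.0),
    "vx": (-60.0, 60.0),
    "vy": (-60.0, 60.0),
}


def remap(values, src, dst=(-1.0, 1.0)):
    """Linearly map values from the interval src onto the interval dst."""
    return dst[0] + (values - src[0]) * (dst[1] - dst[0]) / (src[1] - src[0])


# Observation before remapping (row 0 is the ego-vehicle).
raw = np.array([
    [1, 251.576068, 0.0, 25.000000, 0.0],
    [1, -9.811933, 8.0, -1.846148, 0.0],
    [1, 18.651901, 4.0, -0.108333, 0.0],
    [1, 52.169854, 12.0, -0.569056, 0.0],
    [1, 75.469690, 0.0, -1.198071, 0.0],
])

# Rescale each feature column from its own range to [-1, 1].
scaled = raw.copy()
for name, bounds in FEATURE_RANGES.items():
    col = COLUMNS.index(name)
    scaled[:, col] = remap(raw[:, col], bounds)

# Clipping saturates the ego-vehicle's x at 1.0.
clipped = np.clip(scaled, -1.0, 1.0)

print(scaled[0, 1])   # ego x before clipping, above 1
print(clipped[0, 1])  # ego x after clipping, saturated
```

Only the ego-vehicle's x leaves [-1, 1] here, so it is the only entry that clipping changes.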
So to resolve this, there are several cases:
For absolute=True: do not change anything.
For absolute=False:
- If we keep the information about the ego-vehicle: I propose removing the clipping on the first row of the observation (the row containing the ego-vehicle’s data). But then the elements of the observation are no longer all between -1 and 1, and the difference in magnitude between the first row and the others remains.
- Or a second solution would be to remove the first row, so that the kinematic observation no longer returns information about the ego-vehicle and only returns information about the exo-vehicles.
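To make the two options for absolute=False concrete, here is a minimal sketch operating on the already-remapped array from my example. Both function names are hypothetical, and both assume that row 0 is the ego-vehicle:

```python
import numpy as np


def clip_except_ego(obs, low=-1.0, high=1.0):
    """Option 1: clip every row except row 0 (assumed to be the ego-vehicle)."""
    out = np.array(obs, dtype=float)  # np.array copies, the input is untouched
    out[1:] = np.clip(out[1:], low, high)
    return out


def drop_ego_row(obs, low=-1.0, high=1.0):
    """Option 2: clip everything, then return only the exo-vehicle rows."""
    return np.clip(np.asarray(obs, dtype=float), low, high)[1:]


# Observation after remapping, before clipping (values from this issue).
scaled = np.array([
    [1, 1.677174, 0.00, 0.416667, 0.0],
    [1, -0.065413, 0.50, -0.030769, 0.0],
    [1, 0.124346, 0.25, -0.001806, 0.0],
    [1, 0.347799, 0.75, -0.009484, 0.0],
    [1, 0.503131, 0.00, -0.019968, 0.0],
])

print(clip_except_ego(scaled)[0])  # ego row kept unclipped
print(drop_ego_row(scaled).shape)  # one row fewer than the input
```

Note that option 2 changes the shape of the observation, so the observation space would have to be adjusted accordingly.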
Sorry for such a big message. And again, nice work !
Issue Analytics
- Created 3 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
Ok yep, I did not think about vertical roads, I understand now, thank you !
Thank you, but it’s not a problem for me: in the code I use in my project, I have already made the changes I needed. I am here to help you enhance the project!
Thank you very much, this discussion was very interesting!
It is not clear what you mean by “the RNN will not understand”. In the particular case of a horizontal road, the x coordinate is not relevant, and the RNN should learn to ignore it. The fact that it rises to 1 should not be a problem, especially since it stays bounded.
I see what you mean, but this only applies to the particular case of horizontal roads. In other settings, e.g. the roundabout and intersection environments, the x coordinate is relevant for the decision and should not be set to 0.
So I think that ideally, x and y coordinates should not be treated differently in general (yes, I know they already are, given the default observation scaling along these axes). Here is what I think is the best solution for your problem: