Support dynamic action_space A(s)
I was wondering: does this library support dynamically updating the action_space during agent training? I need to put constraints on my model that disallow specific actions given the current state.
Right now, the code inside my step function does something like the following:
```python
def step(self, a):
    # Advance the environment with the transition model.
    next_state = model(self.current_state, a)
    # Recompute the set of legal actions for the new state.
    self.action_space = action_dist(next_state)
```
I would expect the agent to pick up the new action space and sample from it on the next iteration, but it seems that baselines grabs hold of action_space during init and stores it. Can you point me to the place in the code where baselines samples from the action_space we create inside init? I wonder if I can make some changes to the code so that the action_space can update dynamically based on the state.
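For context, the pattern that causes this is sketched below (a minimal sketch with hypothetical names, not the original poster's actual environment): a Gym environment declares its action_space once in __init__, and baselines/SB3 read that attribute when the model is constructed, so reassigning it inside step() is never seen by the already-built policy.

```python
import numpy as np
import gym
from gym import spaces

class ConstrainedEnv(gym.Env):
    """Minimal sketch: the full action set is declared once at construction.

    The RL algorithm reads `action_space` here, when the model is built, so
    reassigning it later inside `step()` does not change the policy's output
    layer. Per-state legality has to be handled another way (e.g. masking).
    """

    def __init__(self):
        self.action_space = spaces.Discrete(5)  # full, fixed action set
        self.observation_space = spaces.Box(
            low=-1.0, high=1.0, shape=(3,), dtype=np.float32
        )
```

Because the policy's output layer is sized from this attribute at construction time, per-state legality is usually handled with action masking rather than by mutating the space.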
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
SB3 uses the action space's sample() method only for off-policy algorithms during the “warmup phase”, not for A2C/PPO or other on-policy algorithms. I doubt this is different in other codebases.
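To make the comment above concrete, the warmup behaviour looks roughly like this (an illustrative sketch, not the actual SB3 source; `learning_starts` mirrors SB3's hyperparameter of the same name):

```python
def select_action(model, env, obs, num_timesteps, learning_starts=1_000):
    """Illustrative sketch of off-policy action selection during training."""
    if num_timesteps < learning_starts:
        # Warmup phase: uniform random exploration via action_space.sample().
        return env.action_space.sample()
    # After warmup the action comes from the learned policy, so a mutated
    # action_space attribute is never consulted again.
    action, _ = model.predict(obs, deterministic=False)
    return action
```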
Ok, thanks for answering my questions. I’m pretty new to designing a real-world RL algorithm and learning a lot. I have a solution using the SB3-Contrib package.
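For readers hitting the same problem, the SB3-Contrib approach referred to here is presumably invalid-action masking with MaskablePPO. A minimal sketch, assuming a discrete action space; "YourEnv-v0" and legal_actions() are placeholders for your own environment id and masking logic:

```python
import gym
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker

def mask_fn(env):
    # Boolean array of shape (n_actions,): True means the action is legal
    # in the current state. `legal_actions()` is a hypothetical helper on
    # the underlying environment.
    return env.unwrapped.legal_actions()

env = gym.make("YourEnv-v0")        # placeholder environment id
env = ActionMasker(env, mask_fn)    # expose per-state masks to the algorithm
model = MaskablePPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
```

With this wrapper the full action space stays fixed while the mask tells the policy which actions are currently legal, which avoids mutating action_space during training.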