[question] Why am I getting an unexpected activation, even before training?

Note: I understand this is probably not the best place to ask this question. However, I couldn’t find an official or recommended forum for it.

I am creating a custom environment and training an RL agent on it.

My environment has an action space of size 127 and interprets the action as a one-hot vector: it takes the index of the highest value in the vector as the input value. For debugging, I draw a bar chart showing how many times each value has been “called”.

Before training, I would expect the chart to show a roughly uniform distribution of “events”. Instead, the “events” at the lower end of the action space are massively more likely than the others.

I have created a Colab to explain and reproduce the “issue” here.

Here is a short version, in case the Colab doesn’t work:

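import gym
import matplotlib.pyplot as plt
import numpy as np
from gym import spaces
from stable_baselines import PPO2
from stable_baselines.common import make_vec_env
from stable_baselines.common.env_checker import check_env

note_dims = 127
# how many times each pitch index has been selected
note_counters = np.zeros((note_dims))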

class CustomEnv(gym.Env):
  def __init__(self):
    self.action_space = spaces.Box(-1, 1, [note_dims], dtype=np.float32)
    self.observation_space = spaces.Box(-1, 1, [10], dtype=float)
    self._state = np.zeros([10])

  def reset(self):
    self._step_count = 0
    self._state = np.zeros([10])
    return self._state

  def render(self, *args):
    pass

  def step(self, action):
    self._step_count += 1

    # map actions from -1,1 to 0,1
    action = action * 0.5 + 0.5

    pitch = np.argmax(action)
    note_counters[pitch] += 1

    reward = 0
    isdone = self._step_count > 500
    observation = self._state

    return observation, reward, isdone, {}

# Make sure the environment is properly configured
env = CustomEnv()
check_env(env, warn=True)
# vectorize the environment
env = make_vec_env(lambda: env, n_envs=1)

# I have tried multiple model architectures here but the results are always the same
model = PPO2('MlpPolicy', env, verbose=1)

obs = env.reset()
note_counters = np.zeros((note_dims))

for step_idx in range(1000):
  action, _states = model.predict(obs)
  env.step(action)

plt.bar(np.arange(0, note_dims), note_counters)

System info: default Colab instance

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
araffin commented, Jun 3, 2020

“My environment has an action space of size 127, and interprets it as a one-hot vector”

Why do you use a Box space and not a Discrete space?

EDIT: we mention this issue in the documentation: if you use a Box space, the distribution used is a Gaussian, so you won’t get uniform sampling.
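The skew can be reproduced without Stable Baselines. The sketch below rests on two assumptions that are not stated in this thread: the untrained policy draws each action component from a standard Gaussian, and the samples are clipped to the Box bounds [-1, 1] before they reach the environment. Under those assumptions several components often tie at exactly 1.0, and an argmax that returns the first maximum (as np.argmax does) favours low indices. The function names are hypothetical.

```python
import random


def clipped_gaussian_action(n_dims, rng, low=-1.0, high=1.0):
    """Draw n_dims standard-normal samples and clip them to [low, high].

    Assumption: this mimics what an untrained Gaussian policy feeds to a
    Box(-1, 1) action space. It is not taken from the library's source.
    """
    return [min(high, max(low, rng.gauss(0.0, 1.0))) for _ in range(n_dims)]


def first_argmax(values):
    """Index of the first maximum, the same tie-breaking rule as np.argmax."""
    return max(range(len(values)), key=values.__getitem__)


def count_pitches(n_steps=2000, n_dims=127, seed=0):
    """Count how often each index wins the argmax over n_steps random actions."""
    rng = random.Random(seed)
    counts = [0] * n_dims
    for _ in range(n_steps):
        action = clipped_gaussian_action(n_dims, rng)
        counts[first_argmax(action)] += 1
    return counts


counts = count_pitches()
low_share = sum(counts[:20]) / sum(counts)
print(f"share of picks in the lowest 20 of 127 indices: {low_share:.2f}")
```

If the assumptions hold, most picks land in the first few indices, which matches the bar chart described in the question. A Discrete action space, as suggested above, avoids the argmax step entirely.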

0 reactions
PartiallyTyped commented, Jun 3, 2020

All of this is a consequence of the distributions used.

If you take the action vector and reverse it, you will see the same Poisson-like distribution when using an untrained model.

Anyway this is unrelated to sb.
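The reversal experiment described above can be sketched in plain Python. As an assumption for illustration only, actions are modelled as standard-normal samples clipped to [-1, 1], and the argmax returns the first maximum as np.argmax does. All names are hypothetical.

```python
import random

N_DIMS = 127
N_STEPS = 2000


def first_argmax(values):
    """Index of the first maximum, the same tie-breaking rule as np.argmax."""
    return max(range(len(values)), key=values.__getitem__)


rng = random.Random(1)
forward_counts = [0] * N_DIMS   # argmax over the action as sampled
reversed_counts = [0] * N_DIMS  # argmax over the same action, reversed

for _ in range(N_STEPS):
    # Assumed stand-in for an untrained policy: clipped Gaussian noise.
    action = [min(1.0, max(-1.0, rng.gauss(0.0, 1.0))) for _ in range(N_DIMS)]
    forward_counts[first_argmax(action)] += 1
    reversed_counts[first_argmax(action[::-1])] += 1

forward_low = sum(forward_counts[:20]) / N_STEPS
reversed_low = sum(reversed_counts[:20]) / N_STEPS
print(f"forward:  share in lowest 20 indices = {forward_low:.2f}")
print(f"reversed: share in lowest 20 indices = {reversed_low:.2f}")
```

Both histograms pile up at the low indices of whichever vector the argmax sees. Under these assumptions the skew comes from position and tie-breaking, not from the policy preferring particular actions.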
