
[question] Custom environment recommendations


Hi! I am trying to create some RL-based agents in my custom Unity ML-Agents environment. I implemented all the required functions in the Env, but I have several questions:

  • Should the environment reset itself after a particular number of timesteps? In my case it is important to learn behavior from different perspectives (which are generated on every reset), yet I did not find any env.reset() calls in the learn function - perhaps I should call it myself, repeat the learn() call, or do something else?
  • What happens when the agent reaches its target (when the environment is “done”)? I noticed some freezing when one of the agents sends the done signal. Perhaps the environment should handle this situation and reset that agent’s environment? Or should it ignore this and wait for all environments to reset? Is this somehow taken care of in the baselines lib?

For now I am using the A2C algorithm with several concurrent environments. Of course, I can provide all necessary additional information on my setup.

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
ernestum commented, Mar 25, 2019
should the environment reset itself every particular number of timesteps? 

I recommend just returning done = True after some timeout even if the target is not reached. Be sure to omit any terminal rewards in that case. Do not call reset() manually.
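A minimal sketch of that timeout pattern, with purely illustrative names (this is not stable-baselines or Gym API code): the environment returns done = True once a step budget is exhausted, while the terminal reward is reserved for actually reaching the target.

```python
class TimeoutEnv:
    """Toy custom env that ends an episode after `max_steps`, without a
    terminal reward, mirroring the "done = True after a timeout" advice."""

    def __init__(self, max_steps=200):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0  # placeholder observation

    def step(self, action):
        self.steps += 1
        reached_target = False  # your task-specific success check goes here
        if reached_target:
            # terminal reward only when the target is actually reached
            return 0.0, 1.0, True, {}
        if self.steps >= self.max_steps:
            # timeout: end the episode but grant no terminal reward
            return 0.0, 0.0, True, {"timeout": True}
        return 0.0, 0.0, False, {}
```

Flagging the timeout in `info` (as Gym's own TimeLimit wrapper does) lets downstream code distinguish a genuine failure from an episode that was merely truncated.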

1 reaction
araffin commented, Mar 23, 2019

Hello,

yet I did not find any env.reset() calls in learn function

It depends on which algorithm you are using. For instance, PPO2/A2C use a VecEnv that resets automatically (as stated in the docs). For other algorithms, like SAC, the reset is explicit.

should take care of such situation and reset this agent environment?

I assume you are talking about VecEnv, then the answer is in the previous paragraph 😉
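The auto-reset behaviour described above can be sketched as follows (illustrative code, not the actual stable-baselines VecEnv implementation): when a sub-environment signals done, the wrapper resets it immediately and hands the fresh observation back to the learner, so the training loop never calls reset() itself.

```python
class CountEnv:
    """Toy sub-environment whose episode ends every `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 0.0, done, {}


class AutoResetVecEnv:
    """VecEnv-style wrapper: auto-resets any sub-env that reports done."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rews, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d, info = env.step(action)
            if d:
                o = env.reset()  # auto-reset: return the fresh observation
            obs.append(o)
            rews.append(r)
            dones.append(d)
            infos.append(info)
        return obs, rews, dones, infos
```

Note that after an auto-reset, the observation returned alongside done = True already belongs to the new episode; this matches the VecEnv convention and explains why no manual reset() appears in the learn loop.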

Btw, if you are using A2C with continuous actions, there is a bug in the current implementation that is fixed in #206 (it will be merged soon, and the fix is only one line of code). I would recommend either using PPO2 until it is merged or fixing the code yourself (see commit https://github.com/hill-a/stable-baselines/pull/206/commits/689afd16f5b07d2fead1fa5e8474a8efa2826a64).
