Cannot evaluate if trained using more than 1 env [Custom env (Unity)]
Hi,
I’m using a custom environment created with UnityEnvironment and UnityToGymWrapper. To create the env I adapted the code from cmd_util.py. It looks like this:
```python
import os

from gym_unity.envs import UnityToGymWrapper
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from stable_baselines.bench import Monitor
from stable_baselines.common.vec_env import SubprocVecEnv


def make_unity_env(env_directory, num_env, visual, log_path, start_index=0):
    def make_env(rank):
        def _init():
            # Side channel for engine settings (resolution, time scale, quality)
            engine_configuration_channel = EngineConfigurationChannel()
            unity_env = UnityEnvironment(env_directory, worker_id=rank,
                                         side_channels=[engine_configuration_channel])
            env = UnityToGymWrapper(unity_env, uint8_visual=True, flatten_branched=True)
            engine_configuration_channel.set_configuration_parameters(
                time_scale=3.0, width=84, height=84, quality_level=0)
            env = Monitor(env, os.path.join(log_path, str(rank)) if log_path is not None else None,
                          allow_early_resets=True)
            return env
        return _init

    if visual:
        return SubprocVecEnv([make_env(i + start_index) for i in range(num_env)])
    else:
        # Non-visual case not implemented
        pass
```
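For context, a call along these lines would build the vectorized training env (the build path and log directory below are placeholders, not from the original report):

```python
# Hypothetical usage; "./builds/MyEnv" and "./logs" are placeholder paths
env = make_unity_env("./builds/MyEnv", num_env=8, visual=True, log_path="./logs")
```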
I’m using visual observations of size 84x84. The problem appears when I use more than one environment (num_env > 1). Training works (results are not perfect, but there is clear progress compared with a random agent), but if I then want to evaluate the model on a single env (or use EvalCallback) I get this error:
```
Traceback (most recent call last):
  File "(...)/miniconda3/envs/stable_baselines/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "(...)/miniconda3/envs/stable_baselines/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1156, in _run
    (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 84, 84, 1) for Tensor 'input/Ob:0', which has shape '(8, 84, 84, 1)'
```
This particular model was trained using 8 envs; the error is essentially the same for any other env count (except when training with a single env):

```
Cannot feed value of shape (1, 84, 84, 1) for Tensor 'input/Ob:0', which has shape '(<NUM_ENV>, 84, 84, 1)'
```
I’ve tried changing SubprocVecEnv to DummyVecEnv, but the error was the same. I tested the same scenario using the CartPole gym environment and it worked fine.
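For what it’s worth, this error pattern is characteristic of recurrent (LSTM) policies in stable-baselines 2: the policy’s input placeholder is built with a fixed batch dimension equal to the number of training envs, so a single-env observation no longer fits. The workaround that resolved this issue (see the comments below) is to complete the observation batch with zeros up to model.n_envs. A minimal sketch, assuming `model` is the trained recurrent model from above and `eval_env` is a single-env VecEnv built with make_unity_env:

```python
import numpy as np

# Sketch: evaluating a recurrent model trained on 8 envs with a 1-env VecEnv.
# `model` and `eval_env` are assumed to exist; names are illustrative.
n_training_envs = model.n_envs  # 8 for the model above

obs = eval_env.reset()
state = None  # initial LSTM state
done = [False for _ in range(n_training_envs)]

for _ in range(1000):
    # Complete the batch with zeros so it matches the fixed placeholder
    # shape (n_training_envs, 84, 84, 1)
    zero_completed_obs = np.zeros(
        (n_training_envs,) + eval_env.observation_space.shape, dtype=obs.dtype)
    zero_completed_obs[:1] = obs
    action, state = model.predict(zero_completed_obs, state=state, mask=done)
    # Only the first action belongs to the real env
    obs, reward, dones, info = eval_env.step(action[:1])
    # Broadcast the real done flag so the LSTM state resets at episode boundaries
    done = [dones[0] for _ in range(n_training_envs)]
```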
System Info
- stable_baselines 2.10.1 installed from pip
- GPU: GTX 1070 Ti
- Python 3.7 (Miniconda)
- TensorFlow GPU 1.15 installed from conda
Top GitHub Comments
That’d alleviate the headache of many users, so it sounds like a good suggestion 😃. Feel free to create a PR for it unless @araffin has anything against this.
Ok, now it works, thank you 😃 What do you think about extending the default eval callback so that it checks the self.model.policy.recurrent property and, based on that, completes observations with zeros (using the value from self.model.n_envs)? I’m open to creating a PR for that if you are interested.
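A rough sketch of the helper such an extension might use (the function name below is made up for illustration and is not part of stable-baselines):

```python
import numpy as np

def complete_obs_with_zeros(obs, n_envs):
    """Pad an observation batch with zero rows up to n_envs
    (hypothetical helper for the proposed EvalCallback extension)."""
    if obs.shape[0] >= n_envs:
        return obs
    padded = np.zeros((n_envs,) + obs.shape[1:], dtype=obs.dtype)
    padded[:obs.shape[0]] = obs
    return padded

# Inside the extended callback's evaluation loop, before model.predict():
# if self.model.policy.recurrent:
#     obs = complete_obs_with_zeros(obs, self.model.n_envs)
```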