Gym Retro Stable Baselines ACKTR: Cannot parse tensor from proto: dtype: DT_FLOAT [BUG]
Describe the bug
Using Gym Retro with ACKTR causes a TensorFlow error.
The error is raised by this line:
model = ACKTR(MlpPolicy,env,verbose=1)
Error message:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot parse tensor from proto: dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 215040
  }
  dim {
    size: 215040
  }
}
float_val: 0
    [[{{node kfac/mul}}]]
During handling of the above exception, another exception occurred:
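For context on where that shape comes from, here is a minimal sketch (not part of the original report) of the arithmetic, assuming the default gym-retro Genesis frame of roughly 224 x 320 x 3. MlpPolicy flattens each observation into one long vector, and ACKTR's K-FAC optimizer then builds a factor matrix with one row and one column per input feature, which matches the 215040 x 215040 tensor in the error.

# Rough arithmetic behind the error shape (assumes the default 224x320 RGB
# observation that gym-retro returns for Genesis games; not taken from the issue).
height, width, channels = 224, 320, 3
flat_features = height * width * channels
print(flat_features)                  # 215040 -> length of MlpPolicy's flattened input
print(flat_features * flat_features)  # ~4.6e10 entries in a 215040x215040 K-FAC factor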
Error Reproduction:
- Install Anaconda
- Create a conda environment for Stable Baselines
- Install TensorFlow 1.14.0
- Install Stable Baselines
- Install Gym Retro
- Run this code:
# -*- coding: utf-8 -*-
"""
Created on Thu Mar 5 08:59:03 2020
@author: MasterTrader
"""
import gym
import retro
import numpy as np
from stable_baselines import PPO2, A2C, ACKTR
from stable_baselines.common.policies import MlpPolicy, MlpLstmPolicy, MlpLnLstmPolicy, CnnLnLstmPolicy, CnnPolicy, CnnLstmPolicy
from stable_baselines.common.vec_env import SubprocVecEnv, DummyVecEnv
from stable_baselines.common import make_vec_env


class Discretizer(gym.ActionWrapper):
    """
    Wrap a gym-retro environment and make it use discrete
    actions for the Sonic game.
    """

    def __init__(self, env):
        super(Discretizer, self).__init__(env)
        buttons = ["B", "A", "MODE", "START", "UP", "DOWN", "LEFT", "RIGHT", "C", "Y", "X", "Z"]
        actions = [['LEFT'], ['RIGHT'], ['LEFT', 'DOWN'], ['RIGHT', 'DOWN'], ['DOWN'],
                   ['DOWN', 'B'], ['B']]
        self._actions = []
        # For each action in actions:
        #   - create an array of 12 False entries (12 = number of buttons),
        #   - set the index of every button used by that action to True.
        # The result is one boolean array per action, where the True entries
        # are the buttons pressed.
        for action in actions:
            arr = np.array([False] * 12)
            for button in action:
                arr[buttons.index(button)] = True
            self._actions.append(arr)
        self.action_space = gym.spaces.Discrete(len(self._actions))

    def action(self, a):  # pylint: disable=W0221
        return self._actions[a].copy()


def main():
    print("Nasa Main")
    n_cpu = 10
    env = SubprocVecEnv([lambda: Discretizer(retro.make(game='Airstriker-Genesis')) for i in range(n_cpu)])
    model = ACKTR(MlpPolicy, env, verbose=1)
    print("model", model)
    model.learn(total_timesteps=100000)
    obs = env.reset()
    while True:
        action, _states = model.predict(obs)
        print(action)
        obs, reward, dones, info = env.step(action)
        env.render()


if __name__ == "__main__":
    main()
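As a quick check (an addition, not part of the original report), the environment's observation space can be inspected before choosing a policy; a 3D (height, width, channels) shape indicates an image observation, for which CnnPolicy is the usual choice in Stable Baselines.

# Assumption-labelled check (not in the original report): a 3D observation shape
# means the environment returns images rather than a flat feature vector.
import retro

env = retro.make(game='Airstriker-Genesis')
print(env.observation_space.shape)  # expected to be roughly (224, 320, 3) for Genesis games
env.close()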
Log File: ACKTRLog.txt
System Info
Describe the characteristics of your environment:
- How the library was installed (pip, docker, source, …): pip. TensorFlow was installed using Anaconda Navigator.
- GPU models and configuration: CPU only (Ryzen 2700X)
- Python version: 3.7.6, from Anaconda Navigator (1.9.12)
- TensorFlow version: 1.14.0
- Versions of any other relevant libraries: see the attached conda list (codnalist.txt)
Additional context: I am using Windows 10, 64-bit.
Top GitHub Comments

The issue lies in the policy you selected: MlpPolicy treats the input as a 1D vector, which in this case is huge, because the environment returns images. Changing to CnnPolicy lets the code run, but it easily runs out of memory (the images are still big, so you need resizing and so on). I also recommend trying out simpler algorithms first (A2C, PPO); ACKTR and ACER are more complex and slower to run.

Thank you, I will look into that. Maybe in the future model.learn(render=yes) could be added as a parameter?
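To make that suggestion concrete, here is a minimal sketch (not from the issue thread) of what the fix could look like. It assumes the WarpFrame wrapper from stable_baselines.common.atari_wrappers is available to resize frames to 84 x 84 grayscale, reuses the Discretizer wrapper defined in the report, and starts with A2C as the simpler algorithm.

# Hedged sketch of the suggested fix: CnnPolicy plus frame resizing, and a simpler
# algorithm (A2C) first. WarpFrame (84x84 grayscale resize) is assumed to be
# importable from stable_baselines.common.atari_wrappers; adjust if it is not.
import retro
from stable_baselines import A2C
from stable_baselines.common.policies import CnnPolicy
from stable_baselines.common.atari_wrappers import WarpFrame
from stable_baselines.common.vec_env import DummyVecEnv

def make_env():
    env = retro.make(game='Airstriker-Genesis')
    env = Discretizer(env)   # the action wrapper defined in the issue code above
    env = WarpFrame(env)     # shrink raw ~224x320x3 frames to 84x84x1 for the CNN
    return env

env = DummyVecEnv([make_env])
model = A2C(CnnPolicy, env, verbose=1)
model.learn(total_timesteps=100000)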