Gym Retro Stable Baseline ACKTR: Cannot parse tensor from proto: dtype: DT_FLOAT [BUG]


Describe the bug: Using Gym Retro with ACKTR raises a TensorFlow error.

The error is raised by this line:

model = ACKTR(MlpPolicy,env,verbose=1)

Error message:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot parse tensor from proto: dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 215040
  }
  dim {
    size: 215040
  }
}
float_val: 0

         [[{{node kfac/mul}}]]

During handling of the above exception, another exception occurred:
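The tensor shape in the error message is informative. Below is a minimal sketch of the arithmetic, assuming the environment returns 224x320 RGB frames (an assumption that happens to match the 215040 in the error, not something stated in the issue) and that each DT_FLOAT element takes 4 bytes:

```python
# Assumed observation shape: 224x320 RGB frames. This is a guess that
# matches the 215040 reported in the error, not a value from the issue.
HEIGHT, WIDTH, CHANNELS = 224, 320, 3


def flattened_size(height, width, channels):
    """Number of inputs an MLP sees when an image is flattened to 1D."""
    return height * width * channels


def square_matrix_bytes(n, bytes_per_element=4):
    """Memory needed for a dense n x n matrix of 4-byte floats."""
    return n * n * bytes_per_element


n = flattened_size(HEIGHT, WIDTH, CHANNELS)
size_bytes = square_matrix_bytes(n)
print(n)                           # 215040
print(round(size_bytes / 1e9, 1))  # 185.0 (gigabytes)
```

A dense 215040 x 215040 float matrix would need roughly 185 GB, so it is plausible that TensorFlow cannot materialize it. The kfac/mul node name suggests the tensor belongs to the K-FAC optimizer that ACKTR uses, which keeps square factor matrices sized by a layer's input.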

Error Reproduction:

  • Install Anaconda

  • Create a conda environment for Stable Baselines

  • Install TensorFlow 1.14.0

  • Install Stable Baselines

  • Install Gym Retro

  • Run this code:

# -*- coding: utf-8 -*-
"""
Created on Thu Mar  5 08:59:03 2020

@author: MasterTrader
"""

import gym
import retro
from stable_baselines import PPO2, A2C,ACKTR
import numpy as np
from stable_baselines.common.policies import MlpPolicy, MlpLstmPolicy, MlpLnLstmPolicy, CnnLnLstmPolicy, CnnPolicy, CnnLstmPolicy
from stable_baselines.common.vec_env import SubprocVecEnv, DummyVecEnv
from stable_baselines.common import make_vec_env


class Discretizer(gym.ActionWrapper):
    """
    Wrap a gym-retro environment and make it use discrete
    actions for the Sonic game.
    """
    def __init__(self, env):
        super(Discretizer, self).__init__(env)
        buttons = ["B", "A", "MODE", "START", "UP", "DOWN", "LEFT", "RIGHT", "C", "Y", "X", "Z"]
        actions = [['LEFT'], ['RIGHT'], ['LEFT', 'DOWN'], ['RIGHT', 'DOWN'], ['DOWN'],
                   ['DOWN', 'B'], ['B']]
        self._actions = []

        """
        What we do in this loop:
        For each action in actions
            - Create an array of 12 False (12 = nb of buttons)
            For each button in action: (for instance ['LEFT']) we need to make that left button index = True
                - Then the button index = LEFT = True
            In fact at the end we will have an array where each array is an action and each elements True of this array
            are the buttons clicked.
        """
        for action in actions:
            arr = np.array([False] * 12)
            for button in action:
                arr[buttons.index(button)] = True
            self._actions.append(arr)
        self.action_space = gym.spaces.Discrete(len(self._actions))

    def action(self, a): # pylint: disable=W0221
        return self._actions[a].copy()




def main():
    print("Nasa Main")
    n_cpu = 10
    # One Airstriker-Genesis emulator per subprocess, each with discrete actions.
    env = SubprocVecEnv([lambda: Discretizer(retro.make(game='Airstriker-Genesis')) for i in range(n_cpu)])
    # This constructor call raises the InvalidArgumentError shown above.
    model = ACKTR(MlpPolicy, env, verbose=1)
    print("model", model)
    model.learn(total_timesteps=100000)

    obs = env.reset()

    while True:
        action, _states = model.predict(obs)
        print(action)
        obs, reward, dones, info = env.step(action)
        env.render()


if __name__ == "__main__":
    main()    
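Before constructing the model, it can help to look at env.observation_space.shape and pick the policy family from it. The helper below is a hypothetical sketch (suggest_policy is not part of Stable Baselines), and the image shape used in the example is an assumption about this game's frames:

```python
def suggest_policy(observation_shape):
    """Suggest a Stable Baselines policy family from an observation shape.

    Hypothetical helper, not part of Stable Baselines: image-like
    observations (height, width, channels) map to "CnnPolicy", anything
    else to "MlpPolicy".
    """
    if len(observation_shape) == 3:
        return "CnnPolicy"
    return "MlpPolicy"


# Shape assumed for Airstriker-Genesis; in practice, read it from
# env.observation_space.shape before building the model.
print(suggest_policy((224, 320, 3)))  # CnnPolicy
print(suggest_policy((4,)))           # MlpPolicy
```

A check like this would flag the reproduction script's combination of image observations with MlpPolicy before any TensorFlow graph is built.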
 

Log File: ACKTRLog.txt

System Info

  • Installation method: pip. TensorFlow was installed using Anaconda Navigator.
  • GPU models and configuration: CPU Ryzen 2700X
  • Python version: 3.7.6 from Anaconda Navigator (1.9.12)
  • TensorFlow version: 1.14.0
  • Versions of other relevant libraries: see the attached conda list, codnalist.txt

Additional context: I am using Windows 10 64-bit.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7

Top GitHub Comments

1 reaction
Miffyli commented, Mar 5, 2020

The issue lies in the policy you selected: MlpPolicy treats the input as a 1D vector, which in this case is huge because the environment returns images. Switching to CnnPolicy lets the code run, but it easily runs out of memory (the images are still large, so you need to resize them first). I also recommend trying simpler algorithms first (A2C, PPO); ACKTR and ACER are more complex and slower to run.
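The advice above could be sketched roughly as follows. This is an untested sketch under several assumptions: it uses WarpFrame from stable_baselines.common.atari_wrappers to shrink frames to 84x84 grayscale (which needs OpenCV installed), a single environment in a DummyVecEnv, and A2C as the default algorithm. make_env_fn and build_model are hypothetical names, and action_wrapper stands for the Discretizer class from the reproduction script:

```python
def make_env_fn(action_wrapper, game="Airstriker-Genesis"):
    """Return a zero-argument env factory, as the vec env classes expect.

    action_wrapper is any callable that wraps an env, for example the
    Discretizer class from the reproduction script.
    """
    def _init():
        # Imported lazily so this module loads without the RL stack installed.
        import retro
        from stable_baselines.common.atari_wrappers import WarpFrame

        env = retro.make(game=game)
        env = WarpFrame(env)        # resize frames to 84x84 grayscale
        return action_wrapper(env)  # expose a discrete action space
    return _init


def build_model(action_wrapper, algo_name="A2C"):
    """Build a CnnPolicy model on one downscaled Retro environment."""
    import stable_baselines
    from stable_baselines.common.policies import CnnPolicy
    from stable_baselines.common.vec_env import DummyVecEnv

    env = DummyVecEnv([make_env_fn(action_wrapper)])
    algo = getattr(stable_baselines, algo_name)  # e.g. A2C, PPO2, ACKTR
    return algo(CnnPolicy, env, verbose=1)


# Elements per observation before and after WarpFrame (raw shape assumed).
raw_elements = 224 * 320 * 3
warped_elements = 84 * 84 * 1
print(raw_elements // warped_elements)  # 30
```

With the script's Discretizer in scope, build_model(Discretizer).learn(total_timesteps=100000) would start training. Passing algo_name="ACKTR" swaps the algorithm back once the simpler one works.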

0 reactions
toksis commented, Mar 6, 2020

Thank you, I will look into that. Maybe in the future, rendering could be a parameter, e.g. model.learn(render=yes)?
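learn() has no render parameter in the call used in this issue, but rendering during training can probably be approximated with the callback argument of learn(). The sketch below uses the function-style callback that receives the local and global variables of the training loop. make_render_callback is a hypothetical helper, and it assumes that locals_["self"] exposes the vectorized environment as .env, which may vary between Stable Baselines versions:

```python
def make_render_callback(render_every=1):
    """Build a function-style callback that renders the training env.

    Hypothetical helper. render_every controls how often (in callback
    invocations) a frame is drawn, since rendering slows training down.
    """
    calls = {"n": 0}

    def callback(locals_, globals_):
        calls["n"] += 1
        if calls["n"] % render_every == 0:
            # Assumption: locals_["self"] has the vectorized env as `.env`.
            locals_["self"].env.render()
        return True  # returning False is meant to stop training early

    return callback


# Intended usage (not executed here):
# model.learn(total_timesteps=100000, callback=make_render_callback(10))
```

Rendering only every few calls keeps the overhead manageable, and the closure-held counter avoids any global state.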
