DictReplayBuffer: TypeError: 'dict' object cannot be interpreted as an integer
I am using DDPG with HER on a custom env. The error happened in `DictReplayBuffer`, where `self.obs_shape.items()` is `dict_items([('achieved_goal', (3,)), ('desired_goal', (3,)), ('observation', (64, 64, 4))])` (also printed in the traceback below):
```python
self.observations = {
    key: np.zeros((self.buffer_size, self.n_envs) + _obs_shape, dtype=observation_space[key].dtype)
    for key, _obs_shape in self.obs_shape.items()
}
```
I printed `self.obs_shape.items()` as above at the point where the error happened. I also tried to reproduce the dict comprehension separately with the same parameters, and it worked.
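For what it's worth, `np.zeros` raises exactly this message whenever an element of its shape tuple is a dict instead of an int, which points at `self.buffer_size` (or `self.n_envs`) not being an integer here rather than at the observation shapes. A minimal sketch of the failure mode (the dict value is made up for illustration):

```python
import numpy as np

# Every entry of the shape tuple must be an integer; a dict in the
# buffer_size slot reproduces the reported error verbatim.
np.zeros(({'n_sampled_goal': 4}, 1) + (3,))
# TypeError: 'dict' object cannot be interpreted as an integer
```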
Observation space:
```python
@classmethod
def getObservationSpace(cls):
    """Return observation_space for gym Env class."""
    return gym.spaces.Dict(dict(
        desired_goal=gym.spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
        achieved_goal=gym.spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
        observation=gym.spaces.Box(low=0.0, high=1.0, shape=(Robot.CAMERA_PIXEL_HEIGHT, Robot.CAMERA_PIXEL_WIDTH, 4), dtype=np.float32),
    ))
```
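As a sanity check (a standalone sketch; 64 × 64 stands in for `Robot.CAMERA_PIXEL_HEIGHT`/`Robot.CAMERA_PIXEL_WIDTH`, whose values are not shown in the issue), the space itself reports exactly the shapes printed before the traceback, so the space definition looks well-formed:

```python
import gym
import numpy as np

space = gym.spaces.Dict(dict(
    desired_goal=gym.spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
    achieved_goal=gym.spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
    observation=gym.spaces.Box(low=0.0, high=1.0, shape=(64, 64, 4), dtype=np.float32),
))
print({key: subspace.shape for key, subspace in space.spaces.items()})
# {'achieved_goal': (3,), 'desired_goal': (3,), 'observation': (64, 64, 4)}
```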
Reset and step:

```python
def reset(self):
    """Reset environment."""
    self.steps = 0
    self.episode_rewards = 0
    p.resetSimulation()
    # p.setTimeStep(1.0 / 240.0)
    p.setGravity(0, 0, self.GRAVITY)
    self.plane_id = p.loadURDF('plane.urdf')
    self.robot = self.robot_model()
    self.object_ids = []
    for i, (pos, orn) in enumerate(self._generateObjectPositions(num=(self.num_foods + self.num_fakes), radius_scale=self.object_radius_scale, radius_offset=self.object_radius_offset, angle_scale=self.object_angle_scale)):
        if i < self.num_foods:
            urdfPath = 'food_sphere.urdf'
            self.goal = pos
        else:
            urdfPath = 'food_cube.urdf'
        object_id = p.loadURDF(urdfPath, pos, orn, globalScaling=self.object_size)
        self.object_ids.append(object_id)
    for i in range(self.BULLET_STEPS):
        p.stepSimulation()
    obs = self._getObservation()
    # self.robot.printAllJointInfo()
    return obs

def step(self, action):
    """Apply action to environment, then return observation and reward."""
    self.steps += 1
    self.robot.setAction(action)
    # reward = -1.0 * float(self.num_foods) / float(self.max_steps)  # so agent needs to eat foods quickly
    reward = 0
    obs = self._getObservation()  # get obs first
    done = False
    info = {
        'is_success': self.issuccess(obs['achieved_goal'], self.goal),
    }
    for i in range(self.BULLET_STEPS):
        p.stepSimulation()
    reward += self.compute_reward(obs['achieved_goal'], self.goal, info)
    self.episode_rewards += reward
    return obs, reward, done, info  # return the standard gym 4-tuple expected by stable-baselines3
```
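`compute_reward` and `issuccess` are not shown. For HER, the former must accept `(achieved_goal, desired_goal, info)` and, because goals get relabeled in batches, should also work on arrays of goals. A typical sparse version, as a sketch only (the `distance_threshold` attribute is an assumption, not taken from the issue):

```python
def compute_reward(self, achieved_goal, desired_goal, info):
    # HER relabels goals in batches, so handle single goals and arrays alike.
    distance = np.linalg.norm(np.asarray(achieved_goal) - np.asarray(desired_goal), axis=-1)
    # Sparse reward: 0.0 within the threshold, -1.0 otherwise.
    return -(distance > self.distance_threshold).astype(np.float32)
```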
Code for running
```python
import argparse
import gym
import gym_foodhunting
import numpy as np
import time
import tensorflow
import torch

from stable_baselines3 import DDPG, DQN, SAC, TD3
from stable_baselines3.her.her_replay_buffer import HerReplayBuffer
from stable_baselines3.her.goal_selection_strategy import GoalSelectionStrategy
from stable_baselines3.td3.policies import CnnPolicy, MlpPolicy, MultiInputPolicy
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3.common.env_checker import check_env

model_class = DDPG

def learn(env_name, save_file, total_timesteps):
    # env = make_robotics_env(env_name, seed=0)
    env = gym.make(env_name)
    check_env(env)
    # env = HERGoalEnvWrapper(env)
    goal_selection_strategy = 'future'
    online_sampling = True
    replay_buffer_class = HerReplayBuffer
    replay_buffer_kwargs = dict(
        n_sampled_goal=4,
        goal_selection_strategy=goal_selection_strategy,
        online_sampling=online_sampling,
        max_episode_length=80,
    )
    model = model_class(
        'MultiInputPolicy',
        env,
        replay_buffer_class,
        # Parameters for HER
        replay_buffer_kwargs,
        verbose=1,
    )
    # env = SubprocVecEnv([make_env(env_name, i, seed=0) for i in range(8)])
    model.learn(total_timesteps=total_timesteps)
    model.save(save_file)
    del model
    env.close()
```
Traceback:
```
pybullet build time: Jul 4 2021 01:27:40
2021-07-04 01:39:22.108965: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
/usr/local/lib/python3.7/dist-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/env_checker.py:27: UserWarning: It seems that your observation is an image but the `dtype` of your observation_space is not `np.uint8`. If your observation is not an image, we recommend you to flatten the observation to have only a 1D vector
  f"It seems that your observation {key} is an image but the `dtype` "
/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/env_checker.py:35: UserWarning: It seems that your observation space is an image but the upper and lower bounds are not in [0, 255]. Because the CNN policy normalize automatically the observation you may encounter issue if the values are not in that range.
  f"It seems that your observation space {key} is an image but the "
Using cuda device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
dict_items([('achieved_goal', (3,)), ('desired_goal', (3,)), ('observation', (64, 64, 4))])
Traceback (most recent call last):
  File "examples/example_rl1.py", line 77, in <module>
    learn(args.env_name, args.filename, args.total_timesteps)
  File "examples/example_rl1.py", line 39, in learn
    verbose=1,
  File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/ddpg/ddpg.py", line 115, in __init__
    self._setup_model()
  File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/td3/td3.py", line 125, in _setup_model
    super(TD3, self)._setup_model()
  File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/off_policy_algorithm.py", line 218, in _setup_model
    **self.replay_buffer_kwargs,
  File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/buffers.py", line 510, in __init__
    for key, _obs_shape in self.obs_shape.items()
  File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/buffers.py", line 510, in <dictcomp>
    for key, _obs_shape in self.obs_shape.items()
TypeError: 'dict' object cannot be interpreted as an integer
```
I made adjustments based on the warnings from check_env and the same error happened, so it seems the error does not come from the custom env itself.
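Reading the traceback against the `learn` function above: `replay_buffer_class` and `replay_buffer_kwargs` are passed to `model_class(...)` positionally, so they land in DDPG's `learning_rate` and `buffer_size` parameters. That would make `self.buffer_size` inside `DictReplayBuffer` the HER kwargs dict, which is exactly the `np.zeros` failure shown above. Passing them as keyword arguments should fix it; a sketch of the corrected call:

```python
model = model_class(
    'MultiInputPolicy',
    env,
    replay_buffer_class=HerReplayBuffer,        # keyword, not positional
    replay_buffer_kwargs=replay_buffer_kwargs,  # otherwise this dict lands in buffer_size
    verbose=1,
)
```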
Top GitHub Comments
@JimLiu1213 Unfortunately we do not offer custom tech support like this, but I think the issue lies in using images with HER (the replay buffer takes a lot of space). You can either upgrade and get more memory, or you could try using a smaller replay buffer size.
Thank you… Also, when running my custom env with DDPG+HER, this showed up. For my env, the space is dict_items([('achieved_goal', (3,)), ('desired_goal', (3,)), ('observation', (64, 64, 4))]), and the observation is an image. I don't think this could take more than the 12 GB of RAM that Colab provides? I ran the multi_input_env with the same code and it worked, but not my env. Could you please give me some hint about the issue, because I have googled a lot without finding any answer. Do I need to upgrade to Colab Pro?
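For context on the maintainer's memory point: with stable-baselines3's default buffer_size of one million transitions, the (64, 64, 4) float32 observations alone need far more than Colab's ~12 GB, even before the buffer also stores next observations. A rough back-of-envelope check:

```python
# Rough memory for the default buffer_size=1_000_000 with a (64, 64, 4) float32 observation
obs_bytes = 64 * 64 * 4 * 4                # 65,536 bytes per stored frame
total_gib = 1_000_000 * obs_bytes / 1024**3
print(f"{total_gib:.0f} GiB per copy")     # ~61 GiB, and next_obs is stored as well
```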