Image input into TD3

Hi,

I have a custom env with an image observation space and a continuous action space. After training TD3 policies, when I evaluate them there seems to be no reaction to the image observation (I manually drag objects in front of the camera to see what happens).

import gym
from stable_baselines import TD3
from stable_baselines.bench import Monitor
from stable_baselines.td3.policies import CnnPolicy as td3CnnPolicy

# log_dir and SaveOnBestTrainingRewardCallback are defined elsewhere in the script (not shown here).
env = gym.make('GripperEnv-v0')
env = Monitor(env, log_dir)
ExperimentName = "TD3_test"
# layers sets the fully connected layers that follow the CNN feature extractor.
policy_kwargs = dict(layers=[64, 64])
model = TD3(td3CnnPolicy, env, verbose=1, policy_kwargs=policy_kwargs, tensorboard_log="tmp/", buffer_size=15000,
            batch_size=2200, train_freq=2200, learning_starts=10000, learning_rate=1e-3)

callback = SaveOnBestTrainingRewardCallback(check_freq=1100, log_dir=log_dir)
time_steps = 50000
model.learn(total_timesteps=int(time_steps), callback=callback)
model.save("128128/"+ExperimentName)

I can view the observation using OpenCV and it is the right image (single channel, pixel values between 0 and 1).

As I understand it, the CNN is three Conv2D layers that connect to two layers, each 64 units wide. Is it possible that I somehow disconnected these two parts, or could it be that my hyperparameters are just that bad? The behavior the policies learn is similar to what I would get by feeding all-zero pixels into the network.
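One way to tell these two explanations apart is to check whether the trained policy's output changes at all when the image changes. The sketch below is a diagnostic under stated assumptions, not part of Stable Baselines: `action_spread`, `ignores_input` and `uses_input` are hypothetical names, the 128x128 image size is arbitrary, and the stand-in policies exist only so the snippet runs without a trained model. For a real model, `predict_fn` could be something like `lambda obs: model.predict(obs, deterministic=True)[0]`.

```python
import numpy as np


def action_spread(predict_fn, observations):
    """Return the per-dimension standard deviation of actions over observations.

    predict_fn: callable mapping one observation to one action array.
    observations: iterable of observations that all share the same shape.
    """
    actions = np.stack([np.asarray(predict_fn(obs), dtype=np.float64)
                        for obs in observations])
    return actions.std(axis=0)


# Hypothetical stand-in policies so the sketch runs without a trained model.
def ignores_input(obs):
    # Always returns the same action, like the behavior described in the post.
    return np.array([0.5, -0.5])


def uses_input(obs):
    # Action depends on the mean pixel value of the image.
    return np.array([obs.mean(), -obs.mean()])


rng = np.random.default_rng(0)
# Three clearly different single-channel images: black, white, and noise.
test_images = [
    np.zeros((128, 128, 1), dtype=np.float32),
    np.ones((128, 128, 1), dtype=np.float32),
    rng.random((128, 128, 1)).astype(np.float32),
]

spread_ignoring = action_spread(ignores_input, test_images)
spread_using = action_spread(uses_input, test_images)
print("policy that ignores the image:", spread_ignoring)
print("policy that uses the image:   ", spread_using)
```

A spread close to zero on images this different would suggest the observation is being ignored or squashed before it reaches the network, rather than the hyperparameters merely being poor.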

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 20

Top GitHub Comments

2 reactions
araffin commented, May 27, 2020

Being stuck executing one action could be a sign of an environment that is too hard or of a bad learning result, but I do not have such an environment at hand to test this out. @araffin Do you have any experience with this?

SAC/TD3 are very slow with images. I recommend doing something like here or here, where you decouple policy learning from feature extraction.
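As a rough illustration of what decoupling can look like (the linked examples are not reproduced here, so this is a sketch under assumptions and not the approach they use): an encoder is fitted once on previously collected images, then frozen, and the agent only ever sees the resulting low-dimensional feature vector. PCA is swapped in purely as a simple stand-in for a learned encoder, and `PCAEncoder`, `EncodedObservationEnv` and `RandomImageEnv` are hypothetical names.

```python
import numpy as np


class PCAEncoder:
    """Frozen linear encoder fitted once on images collected beforehand."""

    def __init__(self, images, n_features):
        flat = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
        self.mean = flat.mean(axis=0)
        # Rows of vt are principal directions, ordered by explained variance.
        _, _, vt = np.linalg.svd(flat - self.mean, full_matrices=False)
        self.components = vt[:n_features]

    def encode(self, image):
        flat = np.asarray(image, dtype=np.float64).reshape(-1)
        return (self.components @ (flat - self.mean)).astype(np.float32)


class EncodedObservationEnv:
    """Wraps an env with gym-style reset()/step() so the agent sees features."""

    def __init__(self, env, encoder):
        self.env = env
        self.encoder = encoder

    def reset(self):
        return self.encoder.encode(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self.encoder.encode(obs), reward, done, info


class RandomImageEnv:
    """Hypothetical stand-in environment that emits random 32x32 images."""

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)

    def reset(self):
        return self.rng.random((32, 32, 1))

    def step(self, action):
        return self.rng.random((32, 32, 1)), 0.0, False, {}


rng = np.random.default_rng(1)
collected = rng.random((50, 32, 32, 1))  # images gathered before policy training
encoder = PCAEncoder(collected, n_features=8)
wrapped = EncodedObservationEnv(RandomImageEnv(), encoder)

features = wrapped.reset()
next_features, reward, done, info = wrapped.step(np.zeros(2))
print(features.shape, next_features.shape)
```

To use this idea with an actual RL library, the wrapper would also need to declare a matching observation space (for example by subclassing `gym.ObservationWrapper`), and the policy would then be an MLP over the features instead of a CNN over pixels.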

This does not completely answer the question, but I don’t have much time for this right now.

1 reaction
tkelestemur commented, May 28, 2020

Aren’t we supposed to give image observations as values between 0 and 255? I am using 2-channel images as observations and map them from values between 0 and 1 to 0-255. Similar to @C-monC, I have depth images as the observations and I’m getting the same problem, where the agent always chooses the same action no matter what the observations are. Btw, I’m using A2C.
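Whether this is the cause here is not confirmed in the thread, but a mismatch between the declared observation space and the values the env actually returns is cheap to rule out. The numpy-only sketch below assumes the common image convention of `uint8` pixels in [0, 255]; `to_uint8_image` and `matches_declared_space` are hypothetical helpers, and the 64x64 image is random stand-in data. It also shows why casting a [0, 1] float image straight to `uint8` is harmful: the fractional values are truncated, so the image collapses to (almost) all zeros.

```python
import numpy as np


def to_uint8_image(img01):
    """Map a float image with values in [0, 1] to uint8 values in [0, 255]."""
    clipped = np.clip(np.asarray(img01, dtype=np.float32), 0.0, 1.0)
    return np.round(clipped * 255.0).astype(np.uint8)


def matches_declared_space(obs, low, high, dtype):
    """Check that an observation fits the declared bounds and dtype."""
    obs = np.asarray(obs)
    return bool(obs.dtype == np.dtype(dtype)
                and obs.min() >= low and obs.max() <= high)


rng = np.random.default_rng(0)
depth01 = rng.random((64, 64, 1)).astype(np.float32)  # stand-in depth image in [0, 1]

scaled = to_uint8_image(depth01)      # proper rescaling to [0, 255]
truncated = depth01.astype(np.uint8)  # naive cast: fractions are dropped

print("scaled range:   ", scaled.min(), scaled.max())
print("truncated range:", truncated.min(), truncated.max())
print("float image fits a uint8 [0, 255] space:", matches_declared_space(depth01, 0, 255, np.uint8))
print("scaled image fits a uint8 [0, 255] space:", matches_declared_space(scaled, 0, 255, np.uint8))
```

Running a check like `matches_declared_space` on a few real observations from the env, against the `low`, `high` and `dtype` of its declared observation space, would show quickly whether the two agree.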
