Custom Policy Network: Observation Dimensions
❓ Question
```python
def forward(self, observations: th.Tensor) -> th.Tensor:
    print(observations.size())  # print the shape of the input batch
    return self.linear(self.cnn(observations))
```
While the program is running, the printed size is `torch.Size([1, , ])` for a period of time, then `torch.Size([128, , ])` for another period, and this alternation repeats (batch_size is 128). Why is this? Is it possible to make the observation dimension constant?
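The alternating sizes can be illustrated with a minimal NumPy sketch. The observation shape `(3, 84, 84)` below is a hypothetical placeholder (the actual dimensions were omitted in the issue text); the point is only that the leading dimension is the batch size, not part of the observation.

```python
import numpy as np

# Hypothetical image observation shape; the real dims were omitted in the issue.
obs_shape = (3, 84, 84)

# Interaction phase: the policy is queried one observation at a time,
# so the input batch has size 1.
single_obs = np.zeros((1, *obs_shape))
print(single_obs.shape)  # (1, 3, 84, 84)

# Training phase: a minibatch is sampled from the rollout buffer,
# so the input batch has size batch_size (128 here, as in the question).
batch_size = 128
minibatch = np.zeros((batch_size, *obs_shape))
print(minibatch.shape)  # (128, 3, 84, 84)
```

In both phases the per-observation shape is identical; only the leading batch dimension changes.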
Checklist
- I have checked that there is no similar issue in the repo
- I have read the documentation
- If code is provided, it is minimal and working
- If code is provided, it is formatted using markdown code blocks for both code and stack traces
Issue Analytics
- State:
- Created a year ago
- Comments: 5
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
So that’s what I thought. It’s not the size of the observation that changes, but the size of the input batch: it is 64 during training (`batch_size=64`), and 1 during the interaction phase. What program? Please provide minimal and working code.
I think you are confusing the batch size and the observation size. The input tensor to the network has size `batch_size x obs_size`. During evaluation, only one observation is used as input to the network; consequently, the first dimension is 1.
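The comment above implies no code change is needed: a forward pass written over the batch dimension already handles both cases. Here is a minimal NumPy sketch of that idea, with an arbitrary `obs_size` and a single linear layer standing in for the real feature extractor (both are illustrative assumptions, not the actual network from the issue).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the feature extractor: one linear layer
# acting on flattened observations. Sizes chosen only for illustration.
obs_size = 16
n_features = 4
W = rng.standard_normal((obs_size, n_features))

def forward(observations: np.ndarray) -> np.ndarray:
    # The same code handles any leading batch dimension:
    # (1, obs_size) during evaluation, (batch_size, obs_size) during training.
    return observations @ W

eval_out = forward(np.zeros((1, obs_size)))     # interaction phase
train_out = forward(np.zeros((128, obs_size)))  # training minibatch
print(eval_out.shape, train_out.shape)  # (1, 4) (128, 4)
```

Only the leading dimension of the output differs, mirroring the `torch.Size([1, ...])` vs `torch.Size([128, ...])` prints in the question.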