Custom Policy Network: Observation Dimensions
❓ Question
```python
def forward(self, observations: th.Tensor) -> th.Tensor:
    print(observations.size())  # print the shape of the input batch
    return self.linear(self.cnn(observations))
```
While the program is running, the printed size is `torch.Size([1, , ])` for a period of time, then `torch.Size([128, , ])` for another period, and this alternation repeats (batch_size is 128). Why is this? Is it possible to make the observation dimension constant?
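The alternating sizes can be illustrated with a minimal NumPy sketch. The observation shape `(3, 84, 84)` below is a hypothetical placeholder (the actual dimensions were omitted in the issue text); the point is only that the leading dimension is the batch size, not part of the observation.

```python
import numpy as np

# Hypothetical image observation shape; the real dims were omitted in the issue.
obs_shape = (3, 84, 84)

# Interaction phase: the policy is queried one observation at a time,
# so the input batch has size 1.
single_obs = np.zeros((1, *obs_shape))
print(single_obs.shape)  # (1, 3, 84, 84)

# Training phase: a minibatch is sampled from the rollout buffer,
# so the input batch has size batch_size (128 here, as in the question).
batch_size = 128
minibatch = np.zeros((batch_size, *obs_shape))
print(minibatch.shape)  # (128, 3, 84, 84)
```

In both phases the per-observation shape is identical; only the leading batch dimension changes.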
Checklist
- I have checked that there is no similar issue in the repo
- I have read the documentation
- If code is provided, it is minimal and working
- If code is provided, it is formatted using markdown code blocks for both code and stack traces
Issue Analytics
- State:
- Created a year ago
- Comments: 5
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
So that’s what I thought. It’s not the size of the observation that changes, but the size of the input batch: it is 64 during training (`batch_size=64`), and 1 during the interaction phase. What program? Please provide minimal and working code.
I think you are confusing the batch size and the observation size. The input tensor to the network has size `batch_size x obs_size`. During evaluation, only one observation is used as input to the network; consequently, the first dimension is 1.
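The comment above implies no code change is needed: a forward pass written over the batch dimension already handles both cases. Here is a minimal NumPy sketch of that idea, with an arbitrary `obs_size` and a single linear layer standing in for the real feature extractor (both are illustrative assumptions, not the actual network from the issue).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the feature extractor: one linear layer
# acting on flattened observations. Sizes chosen only for illustration.
obs_size = 16
n_features = 4
W = rng.standard_normal((obs_size, n_features))

def forward(observations: np.ndarray) -> np.ndarray:
    # The same code handles any leading batch dimension:
    # (1, obs_size) during evaluation, (batch_size, obs_size) during training.
    return observations @ W

eval_out = forward(np.zeros((1, obs_size)))     # interaction phase
train_out = forward(np.zeros((128, obs_size)))  # training minibatch
print(eval_out.shape, train_out.shape)  # (1, 4) (128, 4)
```

Only the leading dimension of the output differs, mirroring the `torch.Size([1, ...])` vs `torch.Size([128, ...])` prints in the question.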