Clarify which policies share weights between policy and value network.

The documentation does not state explicitly whether the default policies in common.policies share weights between the value network and the policy network. After carefully reading the code, I deduced the following:

  • LstmPolicy shares all weights between the value and policy networks except for the very last linear layer.
  • FeedForwardPolicy shares weights if the ‘cnn’ extractor is used, but uses two entirely separate data streams if the ‘mlp’ extractor is used.

Are there any justifications for this specific setup?
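
As a rough illustration of the two layouts described above, here is a minimal pure-Python sketch. It is not the stable-baselines implementation (which builds TensorFlow graphs). The names Dense, SharedTrunk, SeparateStreams, and shared_layers are hypothetical and exist only for this example, and the layer sizes are arbitrary. SharedTrunk mirrors the layout where everything but the last linear layer is shared. SeparateStreams mirrors the layout with two independent data streams.

```python
import math
import random


class Dense:
    """A tiny fully connected layer (hypothetical helper, not a library class)."""

    def __init__(self, n_in, n_out, rng, activation=None):
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)]
                        for _ in range(n_out)]
        self.bias = [0.0] * n_out
        self.activation = activation

    def __call__(self, x):
        out = [sum(w * xi for w, xi in zip(row, x)) + b
               for row, b in zip(self.weights, self.bias)]
        if self.activation == "tanh":
            out = [math.tanh(v) for v in out]
        return out


class SharedTrunk:
    """Policy and value share one hidden layer; only the final linear heads differ."""

    def __init__(self, n_obs, n_hidden, n_actions, rng):
        self.trunk = Dense(n_obs, n_hidden, rng, "tanh")
        self.policy_head = Dense(n_hidden, n_actions, rng)
        self.value_head = Dense(n_hidden, 1, rng)

    def __call__(self, obs):
        features = self.trunk(obs)  # computed once, used by both heads
        return self.policy_head(features), self.value_head(features)[0]

    def policy_layers(self):
        return [self.trunk, self.policy_head]

    def value_layers(self):
        return [self.trunk, self.value_head]


class SeparateStreams:
    """Policy and value each have their own hidden layer; no weights are shared."""

    def __init__(self, n_obs, n_hidden, n_actions, rng):
        self.policy_hidden = Dense(n_obs, n_hidden, rng, "tanh")
        self.value_hidden = Dense(n_obs, n_hidden, rng, "tanh")
        self.policy_head = Dense(n_hidden, n_actions, rng)
        self.value_head = Dense(n_hidden, 1, rng)

    def __call__(self, obs):
        logits = self.policy_head(self.policy_hidden(obs))
        value = self.value_head(self.value_hidden(obs))[0]
        return logits, value

    def policy_layers(self):
        return [self.policy_hidden, self.policy_head]

    def value_layers(self):
        return [self.value_hidden, self.value_head]


def shared_layers(model):
    """Return the layer objects used by both the policy path and the value path."""
    value_ids = {id(layer) for layer in model.value_layers()}
    return [layer for layer in model.policy_layers() if id(layer) in value_ids]


rng = random.Random(0)
shared = SharedTrunk(n_obs=4, n_hidden=8, n_actions=2, rng=rng)
separate = SeparateStreams(n_obs=4, n_hidden=8, n_actions=2, rng=rng)

obs = [0.1, -0.2, 0.3, 0.0]
logits, value = shared(obs)  # one forward pass yields both outputs

print(len(shared_layers(shared)), len(shared_layers(separate)))  # prints: 1 0
```

In the shared layout, a gradient step on the value loss also changes the features the policy sees, which is the practical consequence the question is asking about. In the separate layout the two losses cannot interfere, at the cost of roughly twice the hidden-layer parameters.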

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 18 (1 by maintainers)

Top GitHub Comments

2 reactions
ernestum commented, Dec 3, 2018

Uhhh I can hear a PR rumbling in the distance …

1 reaction
ernestum commented, Nov 27, 2018

I like the intermediary approach. It takes away the option to “rejoin” the two data streams, which was probably useless in the first place. I might start working on it when I get around to it.
