[Question] Non-shared features extractor in on-policy algorithm
Question
I’ve checked the docs (custom policy -> advanced example), but it is not clear to me how to create a custom policy without sharing the features extractor between the actor and the critic networks in on-policy algorithms.
If I pass a features_extractor_class in policy_kwargs, I think it is shared by default.
I can have a non-shared mlp_extractor by implementing my own _build_mlp_extractor method in my custom policy and creating a network with 2 distinct sub-networks (self.policy_net and self.value_net), but I didn’t understand how to do the same with the features extractor.
In the docs (custom policy -> custom features extractor), it says:
Therefore, since I’m using A2C, I think it should be possible to have a non-shared features extractor by implementing my own policy; I just don’t understand how to do it.
Thanks in advance for any clarification!
Checklist
- I have read the documentation (required)
- I have checked that there is no similar issue in the repo (required)
Issue Analytics
- State:
- Created a year ago
- Comments: 8 (6 by maintainers)

@wlxer I think you could pass the dimensions as parameters to your policy network (not necessarily within kwargs, but explicitly). Then you “save” them as attributes of the net and only then call the superclass’ constructor. It is something that I actually do in my code, but I didn’t report it previously because it was just a personal need.
You can do something a bit like this:
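The snippet that followed this comment is not preserved here. As a stand-in, here is a minimal, library-free sketch of the ordering idea, assuming a base class whose constructor immediately calls a build hook (as ActorCriticPolicy does with its _build method). BasePolicyStandIn, TwoBranchPolicy, and the dimension names are hypothetical and are not part of Stable-Baselines3.

```python
class BasePolicyStandIn:
    """Hypothetical stand-in for a base policy whose constructor builds the networks."""

    def __init__(self, obs_dim):
        self.obs_dim = obs_dim
        # The build hook runs inside the base constructor, so anything it
        # needs must already exist on the instance at this point.
        self.layers = self._build()

    def _build(self):
        return {"shared": (self.obs_dim, 64)}


class TwoBranchPolicy(BasePolicyStandIn):
    def __init__(self, obs_dim, actor_dim, critic_dim):
        # Save the custom dimensions first ...
        self.actor_dim = actor_dim
        self.critic_dim = critic_dim
        # ... and only then call the superclass constructor, which triggers _build().
        super().__init__(obs_dim)

    def _build(self):
        # Layer shapes only, as (in_features, out_features) pairs.
        return {
            "actor": (self.obs_dim, self.actor_dim),
            "critic": (self.obs_dim, self.critic_dim),
        }


policy = TwoBranchPolicy(obs_dim=4, actor_dim=32, critic_dim=16)
print(policy.layers)  # {'actor': (4, 32), 'critic': (4, 16)}
```

If the two assignments were moved below the super().__init__() call, the overridden _build would fail with an AttributeError. In a real PyTorch module this trick only works for plain values such as ints: submodules cannot be assigned before nn.Module's constructor has run.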
I managed to make it run without errors! 🎉
Since I haven’t found a guide/demo or a similar issue here, I’ll briefly explain how I did it:
- Create a custom policy (a subclass of ActorCriticPolicy).
- Override the relevant methods (forward, extract_features, evaluate_actions and predict_values).
Quick demo
Hope it can help someone!
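The demo itself is not reproduced above. The following is a minimal sketch of the same idea in plain PyTorch. It is neither the author's code nor the Stable-Baselines3 implementation. The class name SeparateExtractorsActorCritic and the attributes pi_features_extractor and vf_features_extractor are assumptions made for illustration. The method names mirror the four listed above, but the signatures are simplified and a discrete action space is assumed.

```python
import torch as th
from torch import nn


class SeparateExtractorsActorCritic(nn.Module):
    """Toy actor-critic with one features extractor per head (discrete actions)."""

    def __init__(self, obs_dim: int, n_actions: int, features_dim: int = 32):
        super().__init__()
        # Two independent extractors: no parameters are shared between them.
        self.pi_features_extractor = nn.Sequential(nn.Linear(obs_dim, features_dim), nn.ReLU())
        self.vf_features_extractor = nn.Sequential(nn.Linear(obs_dim, features_dim), nn.ReLU())
        self.action_net = nn.Linear(features_dim, n_actions)
        self.value_net = nn.Linear(features_dim, 1)

    def extract_features(self, obs):
        # Returns a pair instead of a single shared feature tensor.
        return self.pi_features_extractor(obs), self.vf_features_extractor(obs)

    def _distribution(self, pi_features):
        return th.distributions.Categorical(logits=self.action_net(pi_features))

    def forward(self, obs, deterministic: bool = False):
        pi_features, vf_features = self.extract_features(obs)
        dist = self._distribution(pi_features)
        actions = dist.probs.argmax(dim=-1) if deterministic else dist.sample()
        return actions, self.value_net(vf_features), dist.log_prob(actions)

    def evaluate_actions(self, obs, actions):
        pi_features, vf_features = self.extract_features(obs)
        dist = self._distribution(pi_features)
        return self.value_net(vf_features), dist.log_prob(actions), dist.entropy()

    def predict_values(self, obs):
        # Only the critic branch is needed here.
        return self.value_net(self.vf_features_extractor(obs))


policy = SeparateExtractorsActorCritic(obs_dim=4, n_actions=2)
obs = th.randn(8, 4)
actions, values, log_prob = policy(obs)
print(actions.shape, values.shape, log_prob.shape)
```

Because each head has its own extractor, gradients from the value loss never reach the actor's extractor and vice versa. The cost is roughly twice the feature-extraction compute per forward pass.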