[Feature Request] different activation functions in the network_architecture through the policy_kwargs
🚀 Feature
Introduce the possibility of passing multiple activation functions to the policy network using the policy_kwargs.
Motivation
From what I understand, through the policy_kwargs it is possible to pass an activation function to be used by the net_arch part of the policy network. Oftentimes, though, the policy net (pi) and value function net (vf) need different activation functions.
It looks like the only way to have different activation functions in these two sub-networks is to implement our own policy network, as shown in the advanced example here in your documentation. This is also mentioned in issue #481.
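For reference, the custom-network workaround mentioned above can be sketched roughly as follows. This is a minimal illustration, not the library's code: the class name `SplitActivationExtractor` and the layer sizes are made up, and it assumes the usual pattern of a module that exposes `latent_dim_pi` / `latent_dim_vf` and returns separate policy and value latents. In the documentation's example such a module is plugged in by overriding `_build_mlp_extractor` in an `ActorCriticPolicy` subclass; that wiring is omitted here.

```python
import torch
from torch import nn


class SplitActivationExtractor(nn.Module):
    """Hypothetical extractor: separate pi/vf branches, each with its own activation."""

    def __init__(self, feature_dim: int, pi_dim: int = 64, vf_dim: int = 64):
        super().__init__()
        # Sizes of the two latent outputs, read by the policy that owns this module.
        self.latent_dim_pi = pi_dim
        self.latent_dim_vf = vf_dim
        # Policy branch uses Tanh, value branch uses ReLU.
        self.policy_net = nn.Sequential(nn.Linear(feature_dim, pi_dim), nn.Tanh())
        self.value_net = nn.Sequential(nn.Linear(feature_dim, vf_dim), nn.ReLU())

    def forward(self, features: torch.Tensor):
        # Return (policy latent, value latent) for the same input features.
        return self.forward_actor(features), self.forward_critic(features)

    def forward_actor(self, features: torch.Tensor) -> torch.Tensor:
        return self.policy_net(features)

    def forward_critic(self, features: torch.Tensor) -> torch.Tensor:
        return self.value_net(features)


# Quick check on a dummy batch of 4 feature vectors of size 8.
extractor = SplitActivationExtractor(feature_dim=8)
latent_pi, latent_vf = extractor(torch.zeros(4, 8))
```

This works, but it means writing and maintaining a full policy subclass just to change one activation, which is the overhead this request is trying to avoid.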
Alternatives
Ideally it would be possible to have multiple activation functions as follows: one for the shared layers and one for each of the layers of the two sub-networks (policy net (pi) and value net (vf)), mimicking how the architecture is passed.
The architecture is passed this way: [<shared layers>, dict(vf=[<non-shared value network layers>], pi=[<non-shared policy network layers>])] (source: here), so I think it would be possible to use the same structure, but using PyTorch’s activation functions instead of integers.
Example:
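from stable_baselines3 import A2C
from torch.nn import ReLU, Softmax, Tanh

# Proposed syntax (not currently supported): activation_fn mirrors the structure of net_arch.
# `env` is an already created environment with a Dict observation space.
model = A2C(
    'MultiInputPolicy',
    env,
    policy_kwargs=dict(
        net_arch=[256, dict(pi=[128, 50], vf=[32, 1])],
        activation_fn=[Tanh, dict(pi=[ReLU, Softmax], vf=[ReLU, ReLU])],
    ),
)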
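One way the two specs could be matched up internally is sketched below. This is only an illustration of the proposal, not existing library code: `pair_layers` is a hypothetical helper, and plain strings stand in for activation classes such as `torch.nn.Tanh` so that the snippet runs without any dependencies.

```python
from typing import Any, Dict, List, Tuple

Layer = Tuple[int, Any]  # (layer width, activation for that layer)


def pair_layers(net_arch: list, activation_fn: list) -> Dict[str, List[Layer]]:
    """Zip a net_arch spec with an identically shaped activation spec."""
    if len(net_arch) != len(activation_fn):
        raise ValueError("net_arch and activation_fn must have the same length")
    paired: Dict[str, List[Layer]] = {"shared": [], "pi": [], "vf": []}
    for size, act in zip(net_arch, activation_fn):
        if isinstance(size, dict) != isinstance(act, dict):
            raise ValueError("dict entries must sit at the same position in both specs")
        if isinstance(size, dict):
            # Non-shared part: one activation per layer for each head.
            for head in ("pi", "vf"):
                sizes, acts = size.get(head, []), act.get(head, [])
                if len(sizes) != len(acts):
                    raise ValueError(f"'{head}' needs one activation per layer")
                paired[head].extend(zip(sizes, acts))
        else:
            # Shared layer: a single width with a single activation.
            paired["shared"].append((size, act))
    return paired


# Strings stand in for activation classes such as torch.nn.Tanh.
layers = pair_layers(
    [256, dict(pi=[128, 50], vf=[32, 1])],
    ["Tanh", dict(pi=["ReLU", "Softmax"], vf=["ReLU", "ReLU"])],
)
print(layers["pi"])  # [(128, 'ReLU'), (50, 'Softmax')]
```

Validating the shapes up front keeps a mismatch between the two specs from silently dropping an activation.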

yes
@araffin ok so, if I understood correctly, the last layer of the policy net for discrete actions automatically has a softmax activation function, and the one I put in the policy_kwargs is used in all the other layers of both the policy net and the value net. Is that correct?
(I’ll try to work on a draft PR for that feature anyway!)
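To make the softmax part of the question concrete, here is a small dependency-free sketch. It assumes, as described in the comment above, that the final policy layer for discrete actions produces raw scores (logits) that are then normalized into action probabilities; the `softmax` helper is written out by hand for illustration and is not a library function.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Raw outputs of a final linear layer for three discrete actions.
probs = softmax([2.0, 1.0, 0.1])
```

Under that assumption, putting an explicit `Softmax` on the last policy layer, as in the example in the feature request, would normalize the outputs twice.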