[Feature Request] different activation functions in the network_architecture through the policy_kwargs
🚀 Feature
Introduce the possibility of passing multiple activation functions to the policy network using the policy_kwargs.
Motivation
From what I understand, through the policy_kwargs it is possible to pass an activation function to be used by the net_arch part of the policy network. Oftentimes, though, the policy net (pi) and value function net (vf) need different activation functions.
It looks like the only way to have different activation functions in these two sub-networks is to implement our own policy network, as shown in the advanced example here in your documentation. This is also mentioned in issue #481.
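For reference, the workaround could look roughly like the sketch below. It is plain PyTorch and only assumes the interface used by the custom network in the documentation's advanced example (latent_dim_pi, latent_dim_vf, forward, forward_actor, forward_critic). The class name SplitActivationExtractor and its constructor arguments are made up for illustration.

```python
import torch
from torch import nn


class SplitActivationExtractor(nn.Module):
    """Hypothetical extractor: separate activations for the pi and vf branches."""

    def __init__(self, feature_dim, pi_dim=64, vf_dim=64,
                 pi_activation=nn.Tanh, vf_activation=nn.ReLU):
        super().__init__()
        # Output sizes of each branch, read later by the policy that wraps this module
        self.latent_dim_pi = pi_dim
        self.latent_dim_vf = vf_dim
        # Each branch gets its own activation class
        self.policy_net = nn.Sequential(nn.Linear(feature_dim, pi_dim), pi_activation())
        self.value_net = nn.Sequential(nn.Linear(feature_dim, vf_dim), vf_activation())

    def forward(self, features):
        # Return (latent_pi, latent_vf)
        return self.forward_actor(features), self.forward_critic(features)

    def forward_actor(self, features):
        return self.policy_net(features)

    def forward_critic(self, features):
        return self.value_net(features)


extractor = SplitActivationExtractor(feature_dim=8)
latent_pi, latent_vf = extractor(torch.zeros(4, 8))
print(latent_pi.shape, latent_vf.shape)
```

In the documented example, a module like this is assigned to self.mlp_extractor inside _build_mlp_extractor of an ActorCriticPolicy subclass, which is exactly the boilerplate this request would like to avoid.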
Alternatives
Ideally it would be possible to have multiple activation functions as follows: one for the shared layers and one for each of the layers of the two sub-networks (policy net (pi) and value net (vf)), mimicking how the architecture is passed.
The architecture is passed this way: [<shared layers>, dict(vf=[<non-shared value network layers>], pi=[<non-shared policy network layers>])] (source: here), so I think it would be possible to use the same structure, but using PyTorch’s activation functions instead of integers.
Example:
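from stable_baselines3 import A2C
from torch.nn import ReLU, Softmax, Tanh

# Proposed syntax (not currently supported): activation_fn mirrors net_arch.
# env is assumed to be an existing environment with a Dict observation space.
model = A2C(
    "MultiInputPolicy",
    env,
    policy_kwargs=dict(
        net_arch=[256, dict(pi=[128, 50], vf=[32, 1])],
        activation_fn=[Tanh, dict(pi=[ReLU, Softmax], vf=[ReLU, ReLU])],
    ),
)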
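To make the proposal concrete, here is a rough sketch of how such a mirrored specification could be turned into PyTorch modules. It is not the library's implementation. build_branch and build_from_spec are hypothetical helpers, and the input size of 10 is arbitrary. The usage swaps Softmax for Tanh because nn.Softmax is normally constructed with an explicit dim, so a zero-argument factory would have to be passed in its place.

```python
from torch import nn


def build_branch(in_dim, sizes, activations):
    """Stack Linear layers, following each one with its own activation."""
    if len(sizes) != len(activations):
        raise ValueError("need exactly one activation per layer")
    layers = []
    for size, act in zip(sizes, activations):
        layers += [nn.Linear(in_dim, size), act()]
        in_dim = size
    # Also return the output size so the next branch knows its input size
    return nn.Sequential(*layers), in_dim


def build_from_spec(in_dim, net_arch, activation_fn):
    """Parse mirrored [<shared>, dict(pi=[...], vf=[...])] specs (hypothetical format)."""
    shared_sizes = [s for s in net_arch if isinstance(s, int)]
    shared_acts = [a for a in activation_fn if not isinstance(a, dict)]
    heads = next((s for s in net_arch if isinstance(s, dict)), {})
    head_acts = next((a for a in activation_fn if isinstance(a, dict)), {})
    shared, dim = build_branch(in_dim, shared_sizes, shared_acts)
    pi, _ = build_branch(dim, heads.get("pi", []), head_acts.get("pi", []))
    vf, _ = build_branch(dim, heads.get("vf", []), head_acts.get("vf", []))
    return shared, pi, vf


shared, pi, vf = build_from_spec(
    in_dim=10,
    net_arch=[256, dict(pi=[128, 50], vf=[32, 1])],
    activation_fn=[nn.Tanh, dict(pi=[nn.ReLU, nn.Tanh], vf=[nn.ReLU, nn.ReLU])],
)
print(shared, pi, vf)
```

Requiring the two structures to have the same shape keeps validation simple: a length mismatch in any branch fails immediately with a clear error.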
Issue Analytics
- Created a year ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
yes
@araffin ok so, if I understood correctly, the last layer of the policy net for discrete actions automatically has a softmax activation function, and the one I put in the policy_kwargs is used in all the other layers of both the policy net and the value net. Is that correct? (I’ll try to work on a draft PR for that feature anyway!)
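A small illustration of the softmax point, assuming the policy head for discrete actions outputs raw logits that are fed to a categorical distribution (the logits below are made-up values): torch.distributions.Categorical normalizes logits with a softmax on its own, so an explicit Softmax layer at the end of the policy net would be redundant.

```python
import torch
from torch.distributions import Categorical

# Made-up logits for a single observation with three discrete actions
logits = torch.tensor([[2.0, 0.5, -1.0]])

# The distribution turns logits into probabilities internally
dist = Categorical(logits=logits)
probs_from_dist = dist.probs

# The same probabilities computed by hand with a softmax over the last dimension
probs_manual = torch.softmax(logits, dim=-1)

print(torch.allclose(probs_from_dist, probs_manual))
```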