[Feature Request] Gaussian Mixture Distribution
🚀 Feature
Add a (diagonal) GMM distribution class to distributions.py.
Motivation
For now, continuous action spaces are handled by the normal distribution. However, there is no guarantee (I think) that the optimal policy conforms to a Gaussian distribution (e.g. in stochastic environments where two or more actions might be equally valid), so forcing the policy to fit a Gaussian could introduce high bias. A GMM would overcome this through its inherent flexibility, though likely at the cost of slower convergence.
Pitch
I want to add a GMM(Distribution) class that implements a Gaussian mixture distribution for continuous action spaces. Furthermore, I want to add extended classes, including squashed-output and gSDE variants. I will model each Gaussian as diagonal for mathematical/computational simplicity.
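For the core of such a class, the log-density of a diagonal GMM is just a log-sum-exp over per-component diagonal Gaussian log-densities, and sampling is ancestral (pick a component, then sample its Gaussian). A minimal NumPy sketch of those two operations, assuming log-weights are already normalized (function names and shapes are illustrative, not part of any existing stable-baselines3 API):

```python
import numpy as np

def gmm_log_prob(x, log_weights, means, log_stds):
    """Log-density of a diagonal Gaussian mixture (illustrative sketch).

    x:           (D,) action
    log_weights: (K,) log mixture weights, assumed normalized
    means:       (K, D) component means
    log_stds:    (K, D) component log standard deviations
    """
    stds = np.exp(log_stds)
    # Per-dimension Gaussian log-densities, summed over D (diagonal covariance)
    comp_log_probs = -0.5 * (((x - means) / stds) ** 2
                             + 2.0 * log_stds
                             + np.log(2.0 * np.pi)).sum(axis=-1)  # shape (K,)
    # log sum_k w_k N_k(x), computed stably via log-sum-exp
    z = log_weights + comp_log_probs
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

def gmm_sample(rng, log_weights, means, log_stds):
    """Ancestral sampling: pick a component index, then sample its Gaussian."""
    k = rng.choice(len(log_weights), p=np.exp(log_weights))
    return means[k] + np.exp(log_stds[k]) * rng.standard_normal(means.shape[-1])
```

In an actual PyTorch implementation, `torch.distributions.MixtureSameFamily` with a `Categorical` mixture and `Independent(Normal(...), 1)` components provides the same `log_prob`/`sample` machinery directly, which would fit the style of the existing `Distribution` subclasses in distributions.py.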
Alternatives
N/A
Additional context
N/A
### Checklist
- I have checked that there is no similar issue in the repo (required)
Issue Analytics
- State:
- Created 2 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
Hello,
it’s true that there is no guarantee that your optimal policy will be unimodal. However, in practice, the Gaussian distribution usually gives good results. One intuition is that it is probably better to learn to do one thing well (unimodal) than to learn to do many things just okay (multimodal).
There has been some work around that idea, though. Originally, Soft Q-Learning (SQL, the predecessor of SAC) was all about learning those modalities. In fact, the first version of SAC used a Gaussian mixture distribution, but it was later replaced by a simpler diagonal Gaussian distribution.
Link to the first version of the paper: https://arxiv.org/pdf/1801.01290v1.pdf
for the std depending on observation or not: https://github.com/hill-a/stable-baselines/issues/652
Ah I see, thanks for the answers to both questions they’re super helpful 😊