[Question] entropy sign, once again
Question
I am confused about the entropy. The definition is $\text{entropy} = -\sum_i p_i \log(p_i)$, but the code reads:
```python
# Entropy loss favor exploration
if entropy is None:
    # Approximate entropy when no analytical form
    entropy_loss = -th.mean(-log_prob)
else:
    entropy_loss = -th.mean(entropy)
```
and it gives the opposite result (in terms of concavity) from using the entropy as defined (with the factor $p_i$ in front of $\log(p_i)$). I haven’t read any papers, but I think the definition of entropy is quite well established. See the different curve shapes below.
(figures: curve shape from the Wikipedia definition vs. the shape obtained when ignoring $p_i$)
What am I missing?
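A minimal sketch of the two curve shapes described above, for a Bernoulli distribution with parameter $p$ (this example is mine, not part of the original issue): the Shannon definition is concave with a maximum at $p = 0.5$, while dropping the $p_i$ weights gives a convex curve with a minimum at $p = 0.5$.

```python
import numpy as np

# Bernoulli(p): compare the Shannon entropy with the "unweighted" sum of -log(p_i)
p = np.linspace(0.01, 0.99, 99)
shannon = -(p * np.log(p) + (1 - p) * np.log(1 - p))   # concave, max ~0.693 at p = 0.5
unweighted = -(np.log(p) + np.log(1 - p))              # convex, min ~1.386 at p = 0.5
print(shannon[49], unweighted[49])                      # values at p = 0.5
```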
Checklist
- [x] I have read the documentation (required)
- [x] I have checked that there is ~no~ similar issue in the repo (required)
Issue Analytics
- Created: 2 years ago
- Comments: 5 (1 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
As a side remark, we have a test that ensures this estimate is not too bad: https://github.com/DLR-RM/stable-baselines3/blob/master/tests/test_distributions.py#L79 (I think I took the idea from https://github.com/openai/baselines/blob/master/baselines/common/distributions.py#L323).
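For illustration, here is a minimal sketch of that kind of check (my own approximation of the idea behind the linked test, not the actual test code): the Monte Carlo estimate of $\mathbb{E}[-\log p(x)]$ should be close to the analytical entropy for a distribution where the latter is known.

```python
import torch as th

# Monte Carlo check: -mean(log_prob) should approximate the analytical entropy.
# Example distribution chosen here: a diagonal Gaussian (has a closed-form entropy).
dist = th.distributions.Normal(loc=th.zeros(3), scale=th.ones(3))
samples = dist.sample((100_000,))                 # shape: (100000, 3)
analytical = dist.entropy().sum()                 # sum over independent dimensions
estimate = -dist.log_prob(samples).sum(dim=1).mean()
assert th.allclose(analytical, estimate, rtol=1e-2), (analytical, estimate)
```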
Thanks very much for your detailed answers! Indeed, I missed that the code uses the analytical entropy of each distribution when it is available, which makes much more sense; I thought the approximation was also being used for my categorical distribution (illustrated in the sketch below). That was my confusion.
Regarding the current estimate… I don’t know SAC or the area mentioned on the Math Stack Exchange site, but I definitely agree with your analysis of the current approach. I’m happy and will close this issue, but a NotImplementedError, a link to this issue, or something similar could be nice, as this is a bit obscure (I’m not yet certain that the estimator works in the right direction; I think the curve is inverted if we assume p = 1). Feel free to reopen to discuss the point in parentheses.
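A minimal sketch of the distinction discussed above (my own illustration, not code from the issue): for a Categorical distribution, the analytical entropy exists, so the `else` branch of the quoted snippet applies; the Monte Carlo approximation is only needed for distributions without a closed-form entropy, such as the squashed Gaussian used by SAC.

```python
import torch as th

# Categorical distribution: analytical entropy is available, so no approximation is needed.
probs = th.tensor([0.7, 0.2, 0.1])
dist = th.distributions.Categorical(probs=probs)
entropy = dist.entropy()                 # -sum_i p_i * log(p_i)
entropy_loss = -th.mean(entropy)         # minimising this loss maximises the entropy
print(entropy.item(), entropy_loss.item())
```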