Definition of terms related to uncertainty
Some thoughts on “uncertainty”. This issue was inspired by @MechCoder’s comment in #9. The first part of this issue tries to correctly define various terms that often get used interchangeably and are easy to confuse (I confidently predict that I will make at least one error in this post). Once we have defined the terms, we can decide which of them we need in order to evaluate various acquisition functions.
Standard deviation (\sigma): the square root of the variance. It can be calculated for any sample, no matter what distribution the samples come from.
Standard error (of the mean): \sigma / \sqrt{N}, where N is the number of samples. It is a measure of the uncertainty associated with the estimated value of the mean.
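To make the difference between the two concrete, here is a minimal sketch using NumPy. The exponential sample and its parameters are made up for illustration; the point is only that \sigma describes the spread of the samples while the standard error describes how well the mean is pinned down.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sample: 400 draws from a deliberately non-normal distribution.
sample = rng.exponential(scale=2.0, size=400)

n = sample.size
mean = sample.mean()
sigma = sample.std(ddof=1)        # sample standard deviation (sqrt of the variance)
std_error = sigma / np.sqrt(n)    # standard error of the mean

print(f"mean={mean:.3f}  sigma={sigma:.3f}  standard error={std_error:.3f}")
```

Collecting more samples should leave \sigma roughly unchanged while the standard error shrinks like 1 / \sqrt{N}.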
Confidence interval (CI): the N% confidence interval will contain the true value N% of the time. Alice wants to estimate the value of a parameter t, so she constructs an estimator \hat{t} as well as a CI. The 68% CI (around \hat{t}) will contain the true value t in 68% of experiments (that is, if we clone Alice and repeat what she did many times).
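The "clone Alice" picture can be simulated directly. The sketch below assumes normally distributed measurements with a known spread and uses the sample mean as \hat{t}; all numbers (t_true, noise_sigma, the number of experiments) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
t_true = 3.0             # the parameter Alice wants to estimate (hypothetical)
noise_sigma = 2.0        # spread of a single measurement, assumed known here
n_samples = 50           # measurements per experiment
n_experiments = 20_000   # number of "cloned Alices"

# Each row is one experiment; the estimator t_hat is the sample mean.
data = rng.normal(t_true, noise_sigma, size=(n_experiments, n_samples))
t_hat = data.mean(axis=1)
half_width = noise_sigma / np.sqrt(n_samples)   # one standard error

# Does the interval [t_hat - half_width, t_hat + half_width] contain t_true?
covered = np.abs(t_hat - t_true) <= half_width
coverage = covered.mean()
print(f"fraction of intervals containing t: {coverage:.3f}")
```

The printed fraction should come out close to 0.68: each individual interval either contains t or it does not, and the 68% is a statement about the procedure over many repetitions.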
N% quantile: the point x such that the integral of the p.d.f. between -\infty and x equals N%. Think of it as starting at negative infinity and going until you have collected N% of the probability.
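For a sample rather than a p.d.f., the same idea reads "the point below which N% of the samples lie". A small sketch, using a made-up skewed sample and NumPy's np.quantile:

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.gamma(shape=2.0, scale=1.5, size=100_000)  # hypothetical skewed data

level = 0.16                       # N% = 16%
x = np.quantile(sample, level)     # empirical 16% quantile
fraction_below = np.mean(sample <= x)

print(f"16% quantile x={x:.3f}, fraction of samples <= x: {fraction_below:.3f}")
```

No normality is assumed here; the quantile is defined for any distribution.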
If \hat{t} is distributed according to a normal distribution, then the 68% CI is [\hat{t} - \sigma, \hat{t} + \sigma].
For a normal distribution, \mu - \sigma is the 16% quantile.
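Both statements about the normal distribution can be checked numerically with the standard library. The values of mu and sigma below are arbitrary; the results do not depend on them.

```python
from statistics import NormalDist

mu, sigma = 5.0, 2.0          # hypothetical mean and standard deviation
dist = NormalDist(mu, sigma)

# Probability mass inside [mu - sigma, mu + sigma].
central_mass = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)

# Probability mass below mu - sigma, i.e. which quantile mu - sigma is.
lower_tail = dist.cdf(mu - sigma)

print(f"mass within one sigma: {central_mass:.4f}")
print(f"mu - sigma is the {100 * lower_tail:.2f}% quantile")
```

The "68%" and "16%" above are rounded; the more precise values are about 68.27% and 15.87%.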
For our purposes we have a surrogate model (a GP or what have you) for the true, expensive function f. At a given point x, our best estimate of the true value of f is the mean \mu(x) of our surrogate model.
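To have something concrete to point at, here is a minimal NumPy sketch of GP regression that produces \mu(x) together with a per-point standard deviation. It assumes a zero prior mean, a squared-exponential kernel with a fixed length scale, and noise-free observations; the function f and all the points are invented stand-ins, not anything from this project.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of 1-D points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def f(x):
    """Stand-in for the true, expensive function (hypothetical)."""
    return np.sin(3.0 * x) + 0.5 * x

x_train = np.array([-2.0, -1.0, 0.0, 0.7, 1.5])
y_train = f(x_train)
x_query = np.array([0.0, 0.35, 3.0])   # a training point, a nearby point, a far point

jitter = 1e-8  # small diagonal term for numerical stability
K = rbf_kernel(x_train, x_train) + jitter * np.eye(x_train.size)
K_s = rbf_kernel(x_train, x_query)
K_ss = rbf_kernel(x_query, x_query)

L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
mu = K_s.T @ alpha                                   # posterior mean mu(x)
v = np.linalg.solve(L, K_s)
var = np.clip(np.diag(K_ss) - np.sum(v ** 2, axis=0), 0.0, None)
sigma = np.sqrt(var)                                 # posterior standard deviation

for xq, m, s in zip(x_query, mu, sigma):
    print(f"x={xq:+.2f}  mu(x)={m:+.3f}  sigma(x)={s:.3f}  f(x)={f(xq):+.3f}")
```

The standard deviation is close to zero at the training point and grows as the query point moves away from the data.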
Now my understanding runs out -> need help.
What is the band we get from a GP and then feed into EI and friends? Is it the “standard error on the mean” or “68% confidence interval” or “68% credible interval” or something else?
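This does not settle the naming question, but one data point: the usual closed-form expression for EI is derived by treating the surrogate's prediction at x as a Gaussian with mean \mu(x) and standard deviation \sigma(x), so what it consumes is a standard deviation rather than the endpoints of an interval. A sketch under that assumption, written for minimisation; the function name and signature are illustrative and not taken from any library.

```python
from math import pi, sqrt
from statistics import NormalDist

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for minimisation, assuming the prediction at the candidate
    point is Gaussian with mean `mu` and standard deviation `sigma`."""
    if sigma <= 0.0:
        # No uncertainty left: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    std_normal = NormalDist()
    return (f_best - mu) * std_normal.cdf(z) + sigma * std_normal.pdf(z)

# Same predicted mean, different predicted spread (made-up numbers).
ei_narrow = expected_improvement(mu=1.0, sigma=0.5, f_best=1.0)
ei_wide = expected_improvement(mu=1.0, sigma=1.0, f_best=1.0)
print(f"EI with sigma=0.5: {ei_narrow:.4f}")
print(f"EI with sigma=1.0: {ei_wide:.4f}")
```

With the mean held fixed, a larger \sigma gives a larger EI, which is why the exact meaning of the "band" matters for the acquisition function.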
Issue created 7 years ago; 12 comments (12 by maintainers).
Might be nice to write a notebook about these concepts, that could be part of the documentation. What do you think?
Not convinced we need to do that now. To build a working proof of concept, all these empirical statistics can be computed using the `apply` method in trees.
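A rough sketch of what that could look like, assuming "trees" means scikit-learn's tree estimators, whose `apply` method returns the index of the leaf each sample ends up in. The data, the `min_samples_leaf` setting and the query points are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0.0, 0.3, size=500)   # hypothetical noisy target

tree = DecisionTreeRegressor(min_samples_leaf=20, random_state=0).fit(X, y)

# apply() returns, for each sample, the index of the leaf it falls into.
train_leaves = tree.apply(X)

X_query = np.array([[-1.0], [0.5], [2.0]])
query_leaves = tree.apply(X_query)

for x_q, leaf in zip(X_query[:, 0], query_leaves):
    y_leaf = y[train_leaves == leaf]          # training targets sharing that leaf
    mean = y_leaf.mean()
    sigma = y_leaf.std(ddof=1)
    q16, q84 = np.quantile(y_leaf, [0.16, 0.84])
    print(f"x={x_q:+.1f}  n={y_leaf.size}  mean={mean:+.3f}  sigma={sigma:.3f}  "
          f"16%..84%=[{q16:+.3f}, {q84:+.3f}]")
```

Once the training targets in the matching leaf are in hand, the mean, standard deviation and quantiles defined above are all plain empirical statistics of that group.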