FixedNoiseGaussianLikelihood results in negative variance
Hi,
I am trying to train a most likely heteroscedastic GP (from "Most likely heteroscedastic GP regression", Kersting et al., 2007). To this end I am using the FixedNoiseGaussianLikelihood likelihood, setting the noise to positive values r:
(Pdb) print(r)
tensor([0.0086, 0.0071, 0.0071, 0.0069, 0.0067, 0.0067, 0.0065, 0.0065, 0.0065,
0.0065, 0.0065, 0.0065, 0.0065, 0.0066, 0.0066, 0.0066, 0.0070, 0.0076,
0.0076, 0.0090, 0.0107, 0.0110, 0.0117, 0.0122, 0.0130, 0.0135, 0.0140,
0.0154, 0.0202, 0.0208, 0.0218, 0.0226, 0.0229, 0.0265, 0.0270, 0.0280,
0.0282, 0.0285, 0.0287, 0.0289, 0.0290, 0.0290, 0.0289, 0.0288, 0.0287,
0.0276, 0.0269, 0.0241, 0.0219, 0.0218, 0.0191, 0.0177, 0.0175, 0.0166,
0.0164, 0.0160, 0.0146, 0.0145, 0.0131, 0.0120, 0.0112, 0.0105, 0.0100,
0.0094, 0.0082, 0.0081, 0.0079, 0.0072, 0.0072, 0.0070, 0.0070, 0.0059,
0.0058, 0.0057, 0.0057, 0.0056, 0.0052, 0.0051, 0.0049, 0.0049, 0.0048,
0.0048, 0.0045, 0.0045, 0.0044, 0.0043, 0.0043, 0.0042, 0.0042, 0.0042,
0.0042, 0.0042, 0.0042, 0.0042, 0.0042, 0.0042, 0.0045, 0.0046, 0.0046,
0.0047, 0.0050, 0.0051, 0.0059, 0.0067, 0.0086, 0.0104, 0.0115, 0.0161,
0.0213, 0.0226, 0.0278, 0.0399, 0.0413, 0.0418, 0.0463, 0.0567, 0.0592,
0.2299, 0.2421, 0.2920, 0.3486, 0.5690, 0.7409, 0.8167, 1.3840, 1.4557,
1.3335, 1.2206, 1.1017, 0.8272])
lik_3 = FixedNoiseGaussianLikelihood(noise=r, learn_additional_noise=False)
GP3 = ExactGPModel(self.train_x,self.train_y,lik_3)
GP3, lik_3 = train_a_GP(GP3,self.train_x,self.train_y,lik_3,self.training_iter)
where train_a_GP is simply the following training function (copied from a GPyTorch regression tutorial):
def train_a_GP(model, train_x, train_y, likelihood, training_iter):
    # train GP_model for training_iter iterations
    model.train()
    likelihood.train()
    # Use the adam optimizer
    optimizer = torch.optim.Adam([
        {'params': model.parameters()},  # Includes GaussianLikelihood parameters
    ], lr=0.1)
    # "Loss" for GPs - the marginal log likelihood
    mll = ExactMarginalLogLikelihood(likelihood, model)
    for i in range(training_iter):
        # Zero gradients from previous iteration
        optimizer.zero_grad()
        # Output from model
        output = model(train_x)
        # Calc loss and backprop gradients
        loss = -mll(output, train_y)
        loss.backward()
        print('Iter %d/%d - Loss: %.3f lengthscale: %.3f' % (
            i + 1, training_iter, loss.item(),
            model.covar_module.base_kernel.lengthscale.item(),
        ))
        optimizer.step()
    model.eval()
    likelihood.eval()
    return model, likelihood
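For reference, ExactGPModel is not shown in the post; presumably it is the standard exact GP model from the same GPyTorch regression tutorial. A minimal sketch of that assumption, together with the imports the snippets above rely on:

# Imports assumed by the snippets in this post
import torch
import gpytorch
from gpytorch.mlls import ExactMarginalLogLikelihood
from gpytorch.likelihoods import FixedNoiseGaussianLikelihood

class ExactGPModel(gpytorch.models.ExactGP):
    # Standard exact GP from the GPyTorch regression tutorial (assumed, not shown in the post)
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # ScaleKernel(RBFKernel()) exposes covar_module.base_kernel.lengthscale,
        # which the training loop above prints
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)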
However, when I try to obtain predictions, the variance of the returned MultivariateNormal comes out negative:
GP3.eval()
lik_3.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    train_pred = lik_3(GP3(self.train_x), noise=r)
(Pdb) train_pred.variance
tensor([-0.1621, -0.1650, -0.1650, -0.1652, -0.1653, -0.1653, -0.1651, -0.1651,
-0.1650, -0.1650, -0.1648, -0.1648, -0.1647, -0.1644, -0.1642, -0.1642,
-0.1631, -0.1619, -0.1618, -0.1586, -0.1548, -0.1542, -0.1528, -0.1515,
-0.1497, -0.1487, -0.1475, -0.1444, -0.1334, -0.1320, -0.1295, -0.1279,
-0.1271, -0.1192, -0.1182, -0.1162, -0.1159, -0.1155, -0.1152, -0.1150,
-0.1152, -0.1159, -0.1160, -0.1167, -0.1168, -0.1206, -0.1226, -0.1303,
-0.1358, -0.1360, -0.1424, -0.1454, -0.1460, -0.1479, -0.1483, -0.1493,
-0.1522, -0.1524, -0.1552, -0.1574, -0.1590, -0.1602, -0.1611, -0.1622,
-0.1642, -0.1645, -0.1648, -0.1660, -0.1660, -0.1662, -0.1663, -0.1680,
-0.1682, -0.1683, -0.1684, -0.1684, -0.1691, -0.1693, -0.1696, -0.1696,
-0.1697, -0.1698, -0.1701, -0.1702, -0.1703, -0.1704, -0.1705, -0.1706,
-0.1707, -0.1707, -0.1707, -0.1707, -0.1707, -0.1708, -0.1707, -0.1707,
-0.1705, -0.1704, -0.1703, -0.1702, -0.1700, -0.1699, -0.1690, -0.1681,
-0.1660, -0.1638, -0.1626, -0.1569, -0.1504, -0.1486, -0.1416, -0.1242,
-0.1221, -0.1214, -0.1145, -0.0983, -0.0943, 0.2122, 0.2350, 0.3277,
0.4312, 0.8048, 1.0528, 1.1486, 1.5613, 1.5205, 1.2962, 1.1957,
1.1100, 0.8646])
What am I doing wrong? Any help would be greatly appreciated.
Thanks a lot!
Miguel
Top GitHub Comments
@mgarort Okay, I’m pretty sure I know what’s going on here. It’s actually pretty technical.
Basically, for fast predictive variances we decompose (K + \sigma^2 I)^{-1} in a way that is fine because the added noise doesn't change the eigenvalue clustering, it only shifts the whole spectrum. In the heteroscedastic noise setting this is violated, in the sense that adding an arbitrary diagonal component does change the eigenvalue clustering. Turning off fast predictive variances gives positive variances.
To work around this, we could instead decompose K, and then use a QR decomposition to effectively get a root for K^{-1}. This will take a bit to implement. For now, is turning off fast_pred_var a feasible workaround, or do you anticipate having too much data?
Another related issue: #1840
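Following that suggestion, a minimal sketch of the workaround, assuming the same GP3, lik_3, self.train_x and r as in the post: fast_pred_var (LOVE) is only active inside its context manager, so simply dropping it from the prediction block falls back to the exact predictive covariance, which according to the comment above yields positive variances.

# Workaround sketch: the prediction code from the post, minus the
# gpytorch.settings.fast_pred_var() context manager, so that exact
# (non-LOVE) predictive variances are computed.
GP3.eval()
lik_3.eval()
with torch.no_grad():
    train_pred = lik_3(GP3(self.train_x), noise=r)
print(train_pred.variance)  # expected to be positive now

If fast predictive variances are needed later for larger datasets, that would have to wait for the K-based decomposition with a QR root described in the comment.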