Prediction is dependent on other predicted data points
🐛 Bug
This might not be a bug but rather my misunderstanding of how GPyTorch works. My understanding is that, conditional on the training data (i.e. the observations), the prediction for any given data point is independent of which other data points we ask about. However, I see that requesting certain data points at prediction time adds noise to other predictions, which I assume is due to some optimization that is numerically unstable?
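To make that expectation concrete, here is a minimal sketch of the invariance I have in mind, using a plain exact GP on synthetic data (no feature extractor or fancy approximations; all names and values here are illustrative):

import torch
import gpytorch

# A bare-bones exact GP, only to demonstrate the expected invariance:
# predicting on A and on A plus B should give identical values for A.
class SimpleGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

train_x = torch.linspace(0, 1, 50)
train_y = torch.sin(6 * train_x) + 0.1 * torch.randn(50)
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = SimpleGP(train_x, train_y, likelihood)
model.eval()
likelihood.eval()

a = torch.linspace(0, 1, 20)                    # test set A
ab = torch.cat([a, torch.linspace(1, 2, 20)])   # A plus extra points B
with torch.no_grad():
    mean_a = model(a).mean
    mean_ab = model(ab).mean[:20]               # A's predictions inside the larger query
print(torch.allclose(mean_a, mean_ab, atol=1e-5))  # True: A's predictions don't move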
To reproduce
Unfortunately, I don’t have a reproducer I can share, but I can clearly describe the data and model:
model:
ExactGPModel(
  (likelihood): FixedNoiseGaussianLikelihood(
    (noise_covar): FixedGaussianNoise()
  )
  (mean_module): ConstantMean()
  (covar_module): ScaleKernel(
    (base_kernel): RBFKernel(
      (lengthscale_prior): NormalPrior()
      (raw_lengthscale_constraint): GreaterThan(5.000E+01)
      (distance_module): Distance()
    )
    (raw_outputscale_constraint): Positive()
  )
  (feature_extractor): CustomLengthScaleExtractor()
)
where CustomLengthScaleExtractor is a piece-wise monotonic function that just transforms x before it passes through the GP (I don’t think it’s very relevant to the problem, but I can explain more). Essentially this is a 1-dimensional GP with data points (x, y) such that every x is an integer in [-20, 627] and every y is between 0 and 12. The FixedGaussianNoise is almost entirely set to 1, with some data points having a noise of 3.
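For concreteness, a hypothetical reconstruction of this setup might look as follows (CustomLengthScaleExtractor is stubbed out, and the prior, constraint, and noisy indices are illustrative stand-ins, not the actual values):

import torch
import gpytorch

class CustomLengthScaleExtractor(torch.nn.Module):
    # Stub for the piece-wise monotonic transform of x described above.
    def forward(self, x):
        return x

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(
                lengthscale_prior=gpytorch.priors.NormalPrior(100.0, 50.0),  # illustrative
                lengthscale_constraint=gpytorch.constraints.GreaterThan(50.0),
            )
        )
        self.feature_extractor = CustomLengthScaleExtractor()

    def forward(self, x):
        x = self.feature_extractor(x)  # transform x before it enters the GP
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

train_x = torch.arange(-20, 628).to(torch.float32)  # integer x in [-20, 627]
train_y = 12 * torch.rand(len(train_x))             # placeholder y in [0, 12]
noise = torch.ones_like(train_y)                    # fixed noise of 1 everywhere ...
noise[:10] = 3.0                                    # ... except some points at 3 (indices illustrative)
likelihood = gpytorch.likelihoods.FixedNoiseGaussianLikelihood(noise=noise)
gpmodel = ExactGPModel(train_x, train_y, likelihood)
gpmodel.eval()
likelihood.eval()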
Stack trace/error message
This is if we ask only for data points inside the range:
import matplotlib.pyplot as plt

xpred = torch.arange(-20, 500).to(torch.float32)
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    # observed_pred = likelihood(model(test_x))
    f_preds = gpmodel(xpred)
    f_mean = f_preds.mean
    mean = f_preds.mean.detach().cpu().numpy()
    std = f_preds.stddev.detach().cpu().numpy()

plt.figure(figsize=(8, 6))
plt.plot(xpred.numpy(), mean, '-', color='gray')
plt.fill_between(xpred.numpy(), mean - std, mean + std,
                 color='gray', alpha=0.2)
plt.xlim(-5, 5)
plt.ylim(10, 11)
However, when I expand the range, I get some noisy predictions:
xpred = torch.arange(-20, 1000).to(torch.float32)
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    # observed_pred = likelihood(model(test_x))
    f_preds = gpmodel(xpred)
    f_mean = f_preds.mean
    mean = f_preds.mean.detach().cpu().numpy()
    std = f_preds.stddev.detach().cpu().numpy()

plt.figure(figsize=(8, 6))
plt.plot(xpred.numpy(), mean, '-', color='gray')
plt.fill_between(xpred.numpy(), mean - std, mean + std,
                 color='gray', alpha=0.2)
plt.xlim(-5, 5)
plt.ylim(10, 11)
plt.tight_layout()
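To quantify the difference between these two runs directly, one can compare the overlapping region point by point (a sketch, reusing gpmodel from the snippets above):

xa = torch.arange(-20, 500).to(torch.float32)
xb = torch.arange(-20, 1000).to(torch.float32)
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    mean_a = gpmodel(xa).mean
    mean_b = gpmodel(xb).mean[:len(xa)]  # predictions for the same points in the larger query
print((mean_a - mean_b).abs().max())     # ~0 expected for an exact GP; nonzero here per the report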
I also get the following warning (which I guess might be because of fast_pred_var):
..site-packages/gpytorch/distributions/multivariate_normal.py:263: NumericalWarning: Negative variance values detected. This is likely due to numerical instabilities. Rounding negative variances up to 1e-06.
NumericalWarning,
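One way to check whether fast_pred_var (LOVE) is responsible would be to compare its output against the exact predictive standard deviation for the same query (a sketch, reusing gpmodel and xpred from above):

with torch.no_grad():
    exact_std = gpmodel(xpred).stddev    # exact predictive variance (slower)
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    fast_std = gpmodel(xpred).stddev     # LOVE approximation
print((exact_std - fast_std).abs().max())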
Expected Behavior
I expected that as we ask for a larger range of predictions (one that includes the earlier values), the predictions on the earlier set shouldn’t change. In other words, if I predict on A, and then predict on A and B together, the two predictions for A (certainly in mean, and probably in variance?) should be the same. Please let me know if I’ve just misunderstood!
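For reference, this expectation follows from the standard exact-GP posterior equations: for a single test point $x_*$,

$$\mu(x_*) = m(x_*) + k(x_*, X)\,\bigl[K(X, X) + \Sigma\bigr]^{-1}\bigl(y - m(X)\bigr)$$

$$\sigma^2(x_*) = k(x_*, x_*) - k(x_*, X)\,\bigl[K(X, X) + \Sigma\bigr]^{-1}k(X, x_*)$$

Both depend only on $x_*$ and the training data $(X, y)$ (with $\Sigma$ the fixed noise covariance), not on any other test point, so any cross-dependence has to come from an approximation somewhere.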
I was sure that this error starts occurring once you ask for predictions outside the model.train_inputs range (which in my case ends at x = 627), but it seems to start even before that.
Of course, as you let the prediction range go out to much larger numbers, the predictions at x = 0 become nonsense.
System information
- GPyTorch version: 1.4.1
- PyTorch version: 1.7.1+cu101
- OS: RHEL
Top GitHub Comments
Anyways, I’m pretty confident that what you’re seeing as the predictions changing is really the SKI grid changing in response to seeing values of x that are outside the range of the data. The prediction changes are probably most noticeable because your data has a wide range and isn’t standardized to, say, [0, 1] or to have zero mean.
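A sketch of the standardization suggested here (variable names are illustrative; the same transform has to be applied to test inputs at prediction time):

x_min, x_max = train_x.min(), train_x.max()
train_x_scaled = (train_x - x_min) / (x_max - x_min)  # map x into [0, 1]
y_mean, y_std = train_y.mean(), train_y.std()
train_y_scaled = (train_y - y_mean) / y_std           # zero mean, unit variance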
Hi, sorry, I’m not able to access the data now that I’m taking a look at this, but I will try simulating some data that looks roughly similar.
A priori, I’d expect that part of what’s happening is that the grid is changing (e.g. expanding if you predict at certain data points) by virtue of this piece of code: https://github.com/cornellius-gp/gpytorch/blob/c074c2ff5ba5708761453bbd9be870c35cb57769/gpytorch/kernels/grid_interpolation_kernel.py#L158. In general, this should be nearly equivalent to setting the default grid bounds to train_x.min() and train_x.max(), but on a dimension-wise basis.
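If the moving grid is indeed the cause, one possible workaround (a sketch; the grid_size value is an arbitrary illustrative choice) is to pin the grid bounds to the training range when constructing the kernel:

covar_module = gpytorch.kernels.GridInterpolationKernel(
    gpytorch.kernels.RBFKernel(),
    grid_size=400,  # illustrative
    num_dims=1,
    grid_bounds=[(train_x.min().item(), train_x.max().item())],
)

With explicit grid_bounds, the intent is that the grid stays tied to the training range (though, per the linked code, test points outside the bounds may still trigger a grid update).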