GaussianProcessRegressor (predict)
Discussed in https://github.com/scikit-learn/scikit-learn/discussions/22925

Originally posted by jecampagne, March 23, 2022
Hello,

I am questioning the code of `predict` of the `GaussianProcessRegressor`. The code is based on Alg. 2.1 of C. E. Rasmussen & C. K. I. Williams (2003). I have the 2006 version and I do not know if there has been a modification between the two versions. Well, the algorithm is based on the following formulas (R&W eqs. 2.23 and 2.24):

$$\bar{\mathbf{f}}_* = K(X_*, X)\,\big[K(X, X) + \sigma_n^2 I\big]^{-1}\,\mathbf{y}$$

$$\operatorname{cov}(\mathbf{f}_*) = K(X_*, X_*) - K(X_*, X)\,\big[K(X, X) + \sigma_n^2 I\big]^{-1}\,K(X, X_*)$$
Notice that K(X, X) (i.e., X = X_train) is the only term that contains the "noise" parameter, while K(X*, X) and K(X*, X*) (i.e., X* = X_test) do not get this additional part on the diagonal. And this is OK.
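As a quick check of where scikit-learn's `WhiteKernel` actually contributes, here is a minimal sketch (not from the original post): it adds `noise_level` to the diagonal only when the kernel is evaluated as `k(X)`, and returns zeros when evaluated as `k(X, Y)`, even for `Y = X`:

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C

kernel = C(1.0) * RBF(1.0) + WhiteKernel(noise_level=0.5)
X = np.array([[0.0], [1.0], [2.0]])

K_auto = kernel(X)       # WhiteKernel adds noise_level * I here
K_cross = kernel(X, X)   # WhiteKernel returns zeros when Y is given explicitly

print(np.diag(K_auto - K_cross))  # -> [0.5, 0.5, 0.5]
```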
Looking now at the code, it is OK for the default kernel (i.e., RBF with fixed scale and length):

```python
self.kernel_ = C(1.0, constant_value_bounds="fixed") * RBF(
    1.0, length_scale_bounds="fixed"
)
```
But if the user passes a kernel composed with the `WhiteKernel`, such as

```python
kernel = C(1.0) * RBF() + WhiteKernel(0.5)
```

then it seems that K(X*, X) and K(X*, X*) will use the `WhiteKernel` part, which is not what Alg. 2.1 of Rasmussen & Williams does.
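To make the concern concrete end to end, a hedged sketch with illustrative data (the attribute path `kernel_.k2` assumes the `Sum` structure of the kernel above, with the `WhiteKernel` as the second summand):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 5, (20, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(20)
X_test = np.linspace(0, 5, 7).reshape(-1, 1)

kernel = C(1.0) * RBF() + WhiteKernel(0.5)
gpr = GaussianProcessRegressor(kernel=kernel).fit(X_train, y_train)

_, y_std = gpr.predict(X_test, return_std=True)
noise = gpr.kernel_.k2.noise_level  # the fitted WhiteKernel term (k2 of the Sum)

# predict() evaluates kernel_(X_test) internally, so the predictive variance
# is inflated by the fitted noise_level at every test point:
print(y_std**2 - noise)  # what cov(f*) from eq. 2.24 would put on the diagonal
```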
Top GitHub Comments
I set up a similar comparison between the R&W textbook equations (eqns. 2.23 and 2.24, which lead to Alg. 2.1), GPy, tinygp, gpytorch and sklearn. You can find it in ~~this gist~~ this repo. The sklearn bits are at the very end. I think what's happening is that:

- sklearn `predict` = GPy `predict_noiseless` = Alg. 2.1 when `WhiteKernel` is not used. Then `y_cov` = cov(f*) from eq. 2.24, because you then imply `noise_level` = 0. You only have `GaussianProcessRegressor(alpha=...)` (default is `1e-10`, by the way), which is added to the diagonal during `fit()` in the same way as `noise_level` would be. But because it is "small" it is usually not considered noise but a regularization (or "jitter") parameter, which is odd because it has the exact same effect on the fit weights, and thus on `y_mean`, as a noise parameter would. This is true for most GP implementations. There seems to be a magic threshold above which people start calling it noise.
- sklearn `predict` = GPy `predict` != Alg. 2.1 when `WhiteKernel` is used. Then `y_cov = cov(f*) + eye(...) * noise_level`. I think that's what R&W 2006, page 18, the part on "noisy predictions", refers to, which is about the only resource on this I'm aware of. Most other GP resources I've looked at basically discuss eqns. 2.23 and 2.24 only and then call it a day, which doesn't exactly help to clarify things.
- If `kernel(X, X)` is called in `predict()` instead of `kernel(X)`, then it behaves as GPy `predict_noiseless()`.

The central question regarding this issue is whether this behavior is intended, given that the other three tested packages expose the distinction between the behaviors equivalent to `predict` vs. `predict_noiseless` to the user.
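To make the comparison reproducible here, a minimal sketch along the same lines (my own, not the linked repo): implement eqs. 2.23-2.24 / Alg. 2.1 directly with a Cholesky factorization and compare against `predict`. With a `WhiteKernel` in the kernel, the means agree and the covariances differ by `noise_level` on the diagonal:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C

rng = np.random.default_rng(42)
X = rng.uniform(0, 5, (30, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(30)
Xs = np.linspace(0, 5, 5).reshape(-1, 1)

gpr = GaussianProcessRegressor(C(1.0) * RBF(1.0) + WhiteKernel(0.1)).fit(X, y)
k = gpr.kernel_  # the fitted kernel, WhiteKernel term included

# Alg. 2.1 / eqs. 2.23-2.24: the noise enters K(X, X) only.
K = k(X) + gpr.alpha * np.eye(len(X))  # k(X) already has noise_level on the diag
Ks = k(Xs, X)                          # WhiteKernel contributes zeros here
Kss = k(Xs, Xs)                        # ...and here: the noise-free K(X*, X*)
L = cho_factor(K, lower=True)
weights = cho_solve(L, y)
f_mean = Ks @ weights                      # eq. 2.23
f_cov = Kss - Ks @ cho_solve(L, Ks.T)      # eq. 2.24, i.e. cov(f*)

y_mean, y_cov = gpr.predict(Xs, return_cov=True)
print(np.allclose(f_mean, y_mean))         # True: the means agree
print(np.diag(y_cov - f_cov))              # ~ fitted noise_level everywhere
```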
However, `predict()` also calculates `y_cov`. The code does `y_cov = self.kernel_(X) - V.T @ V`. If the above explanation is correct, then one might need to use `self.kernel_(X, X)` instead, which would be K(X*, X*) in R&W 2006 (eqs. 2.24 and 2.26, Alg. 2.1 line 6). The same probably applies to `y_std`.
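A possible user-side workaround, sketched under the assumption that the fitted kernel is a `Sum` whose second term is the `WhiteKernel` (so the fitted noise level sits at `kernel_.k2.noise_level`); it recovers the noise-free cov(f*) of eq. 2.24 from the public API:

```python
import numpy as np

def predict_noiseless(gpr, X):
    """Noise-free posterior (cov(f*) of R&W eq. 2.24) from a fitted
    GaussianProcessRegressor whose kernel is <signal kernel> + WhiteKernel.
    """
    y_mean, y_cov = gpr.predict(X, return_cov=True)
    noise = gpr.kernel_.k2.noise_level  # assumes WhiteKernel is the second summand
    return y_mean, y_cov - noise * np.eye(len(X))
```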