
GaussianProcessRegressor (predict)


Discussed in https://github.com/scikit-learn/scikit-learn/discussions/22925


Originally posted by jecampagne, March 23, 2022

Hello, I am questioning the code of predict of the GaussianProcessRegressor. The code is based on Alg. 2.1 of C. E. Rasmussen & C. K. I. Williams (2003). I have the 2006 version and do not know whether anything changed between the two versions. The algorithm is based on the following formulas:

$$\bar{f}_* = K(X_*, X)\,\big[K(X, X) + \sigma_n^2 I\big]^{-1} \mathbf{y}$$

$$\operatorname{cov}(f_*) = K(X_*, X_*) - K(X_*, X)\,\big[K(X, X) + \sigma_n^2 I\big]^{-1} K(X, X_*)$$

Notice that K(X, X) (i.e., X = X_train) is the only term that contains the "noise" parameter, while K(X*, X) and K(X*, X*) (i.e., X* = X_test) do not get this additional part on the diagonal. And this is ok.

Looking now at the code, it is ok for the default kernel (i.e., RBF with fixed scale and length):

self.kernel_ = C(1.0, constant_value_bounds="fixed") * RBF(
    1.0, length_scale_bounds="fixed"
)

But if the user passes a kernel composed with WhiteKernel, as in

kernel = C(1.0) * RBF() + WhiteKernel(0.5)

then it seems that K(X*, X) and K(X*, X*) will pick up the WhiteKernel part, which is not what Alg. 2.1 of Rasmussen & Williams does.
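The asymmetry in question can be seen directly on WhiteKernel itself: its one-argument call adds noise_level to the diagonal, while its two-argument call returns zeros. A minimal check (the input array is illustrative):

```python
import numpy as np
from sklearn.gaussian_process.kernels import WhiteKernel

X = np.array([[0.0], [1.0], [2.0]])
k = WhiteKernel(noise_level=0.5)

# One-argument form: noise_level on the diagonal.
print(k(X))
# Two-argument form: all zeros, even for identical inputs.
print(k(X, X))
```

So whether predict() sees the noise term depends entirely on which call form it uses for each of the three kernel matrices.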


Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

elcorto commented, May 17, 2022 (1 reaction)

I set up a similar comparison between the R&W textbook equations (eqs. 2.23 and 2.24, which lead to Alg. 2.1), GPy, tinygp, gpytorch and sklearn. You can find it in ~~this gist~~ this repo. The sklearn bits are at the very end.

I think what’s happening is that

  • sklearn predict = GPy predict_noiseless = Alg. 2.1 when WhiteKernel is not used; then y_cov = cov(f*) from eq. 2.24, because you imply noise_level=0. You only have GaussianProcessRegressor(alpha=...) (default is 1e-10, by the way), which is added to the diagonal during fit() in the same way noise_level would be. But because it is "small", it is usually not considered noise but a regularization (or "jitter") parameter, which is odd because it has exactly the same effect on the fit weights, and thus on y_mean, as a noise parameter would. This is true for most GP implementations. There seems to be a magic threshold above which people start calling it noise.
  • sklearn predict = GPy predict != Alg. 2.1 when WhiteKernel is used; then y_cov = cov(f*) + eye(...) * noise_level. I think that's what R&W 2006, page 18, part "noisy predictions" refers to, which is about the only resource on this I'm aware of. Most other GP resources I've looked at basically discuss eqs. 2.23 and 2.24 only and then call it a day, which doesn't exactly help to clarify things.
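The second case can be checked with a small sketch (hyperparameters frozen via optimizer=None so noise_level keeps its initial value; the data here is illustrative):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, (20, 1))
y = np.sin(X).ravel()

# Freeze hyperparameters so noise_level stays at 0.1 after fit().
kernel = C(1.0) * RBF(1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, optimizer=None).fit(X, y)

X_test = np.linspace(0, 5, 7).reshape(-1, 1)
_, y_cov = gpr.predict(X_test, return_cov=True)

# Every diagonal entry of y_cov carries at least noise_level, since
# predict() evaluates the fitted kernel with one argument on X_test.
print(np.diag(y_cov).min() >= 0.1 - 1e-8)
```

With WhiteKernel removed from the kernel, the same diagonal entries can drop well below 0.1, matching the y_cov = cov(f*) + eye(...) * noise_level description above.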

If kernel(X,X) is called in predict() instead of kernel(X), then it behaves as GPy predict_noiseless().

The central question regarding this issue here is whether this behavior is intended, given that the other 3 tested packages expose the distinction between the behaviors equivalent to predict vs. predict_noiseless to the user.

elcorto commented, Apr 2, 2022 (1 reaction)

However, predict() also calculates y_cov. The code does y_cov = self.kernel_(X) - V.T @ V. If the above explanation is correct, then one might need to use self.kernel_(X, X) instead, which would be K(X*, X*) in R&W 2006 (eqs. 2.24, 2.26, Alg. 2.1 line 6). The same probably applies to y_std.
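The proposed distinction can be sketched by hand-rolling Alg. 2.1 so both kernel-call variants sit side by side; the only assumption is WhiteKernel's behavior shown earlier (two-argument call contributes zeros):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C

rng = np.random.default_rng(1)
X = rng.uniform(0, 5, (15, 1))
y = np.sin(X).ravel()
X_test = np.linspace(0, 5, 4).reshape(-1, 1)

kernel = C(1.0) * RBF(1.0) + WhiteKernel(noise_level=0.3)

K = kernel(X)                        # K(X, X) + noise on the diagonal
L = cholesky(K, lower=True)          # Alg. 2.1 line 2
K_trans = kernel(X_test, X)          # K(X*, X); WhiteKernel adds zeros here
V = solve_triangular(L, K_trans.T, lower=True)

cov_noisy     = kernel(X_test) - V.T @ V          # one-arg call: + noise*I
cov_noiseless = kernel(X_test, X_test) - V.T @ V  # two-arg call: no noise
```

The two results differ by exactly noise_level on the diagonal, i.e. cov_noisy - cov_noiseless == 0.3 * eye(4), which is the gap between predict and a GPy-style predict_noiseless.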
