Converting sklearn GP code to GPyTorch

See original GitHub issue

This is a request for documentation. I’m trying to convert some GP code using sklearn to equivalent code in GPyTorch. Here is a simple example of some sklearn code:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process.kernels import Matern
from sklearn.gaussian_process import GaussianProcessRegressor

gp = GaussianProcessRegressor(
    kernel=Matern(nu=2.5, length_scale=1),
    alpha=0,
    optimizer=None
)

train_x = np.linspace(-2, 2, 20).reshape(-1, 1)
train_y = np.sin(train_x).ravel()

gp.fit(train_x, train_y)
test_x = np.linspace(-4, 4, 200).reshape(-1, 1)
test_y = np.sin(test_x).ravel()
pred_y = gp.predict(test_x)

plt.plot(test_x, test_y, label="True")
plt.plot(test_x, pred_y, label="Pred")
plt.scatter(train_x, train_y, label="Obs")
plt.legend()

This generates the following figure:

[Figure: sklearn prediction plotted against the true sine curve and the training observations]

And here is my attempt to replicate this in GPyTorch:

import numpy as np
import matplotlib.pyplot as plt
import torch
import gpytorch

# this is the model from the GP regression tutorial but with
# the mean/kernel changed to try to match those above

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ZeroMean()
        matern = gpytorch.kernels.MaternKernel(
            nu=2.5,
            lengthscale_prior=gpytorch.priors.NormalPrior(1, .001)
        )
        self.covar_module = matern

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

train_x = np.linspace(-2, 2, 20).reshape(-1, 1)
train_y = np.sin(train_x).ravel()

likelihood = gpytorch.likelihoods.GaussianLikelihood(
    noise_prior=gpytorch.priors.GammaPrior(.001, 100)
)
model = ExactGPModel(
    torch.tensor(train_x).float(),
    torch.tensor(train_y).float(),
    likelihood
)
model.eval()

test_x = np.linspace(-4, 4, 200).reshape(-1, 1)
test_y = np.sin(test_x).ravel()
pred_y = model(torch.tensor(test_x).float()).loc.detach().numpy()

plt.plot(test_x, test_y, label="True")
plt.plot(test_x, pred_y, label="Pred")
plt.scatter(train_x, train_y, label="Obs")
plt.legend()

which gives this figure:

[Figure: GPyTorch prediction plotted against the true sine curve and the training observations]

In both cases, I’m trying to explicitly set the hyperparameters of the GP; I don’t do any training. In sklearn, I set the lengthscale of the Matern kernel explicitly; in GPyTorch, there’s a prior for this, but I set it so that it is very concentrated around the same value (1). I set the additive noise (alpha) in sklearn to zero, and I try to set the Gaussian noise in GPyTorch to ~0 by using a Gamma prior which is very concentrated near 0.

However, it still seems like the GPyTorch GP thinks there is more noise in the data. Is there something else that I need to do to tell GPyTorch that this data is noise-free? Or am I doing something else wrong to try to replicate the sklearn example?

I think it might be nice somewhere in the docs to have an example like this: a GP in sklearn and one in GPyTorch to show how to get equivalent behavior.
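For reference, the effect of the noise term can be checked without either library. The following is a minimal NumPy sketch, not sklearn's or GPyTorch's implementation: `matern52` and `gp_posterior_mean` are hypothetical helpers written for illustration, and the noise value 0.693 is an assumption (it is softplus(0), which an untouched GPyTorch noise parameter is believed to start at, since a prior on its own does not change a parameter's value).

```python
import numpy as np


def matern52(a, b, length_scale=1.0):
    """Unit-variance Matern kernel with nu=2.5 for inputs of shape (n, 1)."""
    d = np.abs(a - b.T) / length_scale        # pairwise distances, shape (n, m)
    s = np.sqrt(5.0) * d
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)


def gp_posterior_mean(train_x, train_y, test_x, noise):
    """Zero-mean GP posterior mean: K_* (K + noise * I)^{-1} y."""
    K = matern52(train_x, train_x) + noise * np.eye(len(train_x))
    K_star = matern52(test_x, train_x)
    return K_star @ np.linalg.solve(K, train_y)


train_x = np.linspace(-2, 2, 20).reshape(-1, 1)
train_y = np.sin(train_x).ravel()
test_x = np.linspace(-4, 4, 200).reshape(-1, 1)

# Noise-free case, mirroring alpha=0 in the sklearn example.
mean_noise_free = gp_posterior_mean(train_x, train_y, test_x, noise=0.0)

# Assumed "default-like" noise: the mean is pulled toward the zero prior mean.
mean_noisy = gp_posterior_mean(train_x, train_y, test_x, noise=0.693)
```

Plotting `mean_noise_free` and `mean_noisy` against `np.sin(test_x)` shows how a larger noise variance shrinks the prediction toward zero inside the training range, which is consistent with the behavior described above.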

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 14 (13 by maintainers)

Top GitHub Comments

1 reaction
jacobrgardner commented, Mar 5, 2019

Sorry one more thing: the prior is also a completely optional argument. If you don’t want the lengthscale prior, just don’t pass it to the kernel. In some sense, the minimal kernel definition for Matern you could use here would be:

matern = gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=2.5))

If you want to specify the noise explicitly, you can do:

likelihood.initialize(noise=some_value)

@gpleiss Let’s make a python notebook demonstrating basic functionality for things like initializing hyperparameter values, passing priors, saving and loading models to disk, etc.

0 reactions
jacobrgardner commented, Mar 8, 2019

Ah, yes, that makes sense. Some recent commits to master will help the model automatically adapt better to the ill-conditioned setting, essentially by using more time to get better solves.
