Slow predictive posterior eval with `keops` + `fast_pred_var()`
The posterior predictive evaluation with a keops kernel and fast_pred_var() is very slow. What am I doing wrong?
To reproduce
**Code snippet to reproduce (e.g. using Google Colab, GPU runtime)**
!pip install pykeops
!pip install gpytorch
import torch
import gpytorch
import pykeops
import time
import numpy as np
pykeops.verbose = True
pykeops.clean_pykeops()
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu') # i'm using the GPU
n_samples = 50000
n_samples_val = 25000
n_features = 10
def make_target(X):
    # y = 1 + sum(x^2) + noise
    y = 1.0 + np.square(X).sum(axis=-1) + 0.1*np.random.randn(X.shape[0])
    return y.astype(np.float32)
# train data
X_train = np.random.randn(n_samples, n_features).astype(np.float32)
y_train = make_target(X_train)
# validation data
X_val = np.random.randn(n_samples_val, n_features).astype(np.float32)
y_val = make_target(X_val)
train_X = torch.as_tensor(X_train, device=device).contiguous()
train_y = torch.as_tensor(y_train, device=device).contiguous()
val_X = torch.as_tensor(X_val, device=device).contiguous()
val_y = torch.as_tensor(y_val, device=device).contiguous()
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.keops.MaternKernel(nu=2.5))
        # self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=2.5))
    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
# initialize likelihood and model
likelihood = gpytorch.likelihoods.GaussianLikelihood().to(device)
model = ExactGPModel(train_X, train_y, likelihood).to(device)
def eval_fit() -> float:
    # Get into evaluation (predictive posterior) mode
    model.eval()
    likelihood.eval()
    # Make predictions by feeding model through likelihood
    start_time = time.time()
    with torch.no_grad(), gpytorch.settings.fast_pred_var():
        y_pred = likelihood(model(val_X))
        y_pred_mean = y_pred.mean.cpu().numpy()
    print(f'Eval - elapsed time: {(time.time() - start_time):.3f} sec ...')
    # RMSE on the validation set
    return np.sqrt(np.mean(np.square(y_pred_mean - y_val)))
model.train()
likelihood.train()
# Use the adam optimizer
optimizer = torch.optim.Adam(
    [
        {'params': model.parameters()},  # Includes GaussianLikelihood parameters
    ],
    lr=0.01
)
# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
training_iter = 5
for i in range(training_iter):
    start_time = time.time()
    # Zero gradients from previous iteration
    optimizer.zero_grad()
    # Output from model
    output = model(train_X)
    # Calc loss and backprop gradients
    loss = -mll(output, train_y)
    loss.backward()
    print('Iter %d/%d - Loss: %.3f lengthscale: %.3f noise: %.3f time:%.3f' % (
        i + 1, training_iter, loss.item(),
        model.covar_module.base_kernel.lengthscale.item(),
        model.likelihood.noise.item(),
        time.time() - start_time
    ))
    optimizer.step()
    # Evaluate on the validation set, then switch back to training mode
    eval_fit()
    model.train()
    likelihood.train()
Output
Compiling libKeOpstorch53bd9c5b1e in /root/.cache/pykeops-1.4.1-cpython-36/build-libKeOpstorch53bd9c5b1e:
formula: Sum_Reduction(((((Var(0,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))) + (IntCst(1) + (Var(3,1,2) * Square(Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1))))))))) * Exp((Var(4,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))))) * Var(5,11,1)),0)
aliases: Var(0,1,2); Var(1,10,0); Var(2,10,1); Var(3,1,2); Var(4,1,2); Var(5,11,1);
dtype : float32
... Done.
Compiling libKeOpstorch308f0e2d0d in /root/.cache/pykeops-1.4.1-cpython-36/build-libKeOpstorch308f0e2d0d:
formula: Grad_WithSavedForward(Sum_Reduction(((((Var(0,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))) + (IntCst(1) + (Var(3,1,2) * Square(Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1))))))))) * Exp((Var(4,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))))) * Var(5,11,1)),0), Var(1,10,0), Var(6,11,0), Var(7,11,0))
aliases: Var(0,1,2); Var(1,10,0); Var(2,10,1); Var(3,1,2); Var(4,1,2); Var(5,11,1); Var(6,11,0); Var(7,11,0);
dtype : float32
... Done.
Compiling libKeOpstorchaeed646587 in /root/.cache/pykeops-1.4.1-cpython-36/build-libKeOpstorchaeed646587:
formula: Grad_WithSavedForward(Sum_Reduction(((((Var(0,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))) + (IntCst(1) + (Var(3,1,2) * Square(Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1))))))))) * Exp((Var(4,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))))) * Var(5,11,1)),0), Var(2,10,1), Var(6,11,0), Var(7,11,0))
aliases: Var(0,1,2); Var(1,10,0); Var(2,10,1); Var(3,1,2); Var(4,1,2); Var(5,11,1); Var(6,11,0); Var(7,11,0);
dtype : float32
... Done.
Iter 1/5 - Loss: 20.431 lengthscale: 0.693 noise: 0.693 time:118.198
Compiling libKeOpstorch5855ba4b6c in /root/.cache/pykeops-1.4.1-cpython-36/build-libKeOpstorch5855ba4b6c:
formula: Sum_Reduction(((((Var(0,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))) + (IntCst(1) + (Var(3,1,2) * Square(Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1))))))))) * Exp((Var(4,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))))) * Var(5,1,1)),0)
aliases: Var(0,1,2); Var(1,10,0); Var(2,10,1); Var(3,1,2); Var(4,1,2); Var(5,1,1);
dtype : float32
... Done.
Compiling libKeOpstorchebbd8d09bd in /root/.cache/pykeops-1.4.1-cpython-36/build-libKeOpstorchebbd8d09bd:
formula: Sum_Reduction(((((Var(0,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))) + (IntCst(1) + (Var(3,1,2) * Square(Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1))))))))) * Exp((Var(4,1,2) * Sqrt(Sum(Square((Var(1,10,0) - Var(2,10,1)))))))) * Var(5,100,1)),0)
aliases: Var(0,1,2); Var(1,10,0); Var(2,10,1); Var(3,1,2); Var(4,1,2); Var(5,100,1);
dtype : float32
... Done.
Eval - elapsed time: 80.717 sec ...
Iter 2/5 - Loss: 19.944 lengthscale: 0.698 noise: 0.698 time:5.715
Eval - elapsed time: 14.651 sec ...
Iter 3/5 - Loss: 19.469 lengthscale: 0.703 noise: 0.703 time:5.717
Eval - elapsed time: 14.667 sec ...
Iter 4/5 - Loss: 19.006 lengthscale: 0.708 noise: 0.708 time:5.695
Eval - elapsed time: 14.709 sec ...
Iter 5/5 - Loss: 18.556 lengthscale: 0.713 noise: 0.713 time:5.707
Eval - elapsed time: 14.928 sec ...
Expected Behavior
Predictive posterior evaluation should be (much) faster. I would have expected the eval time to be close to the time taken for a single training iteration. Instead, a training iteration takes about 6 seconds and an evaluation about 15 seconds (except for the 1st iteration, of course). If I batch the predictions (with a batch size of, say, 4096), the first batch takes ca. 14 sec, while the others complete in a fraction of a second.
System information
- GPyTorch Version 1.1.1
- PyTorch Version 1.5.1
- Computer OS: Linux, CUDA 10.2
Additional context
Best way to reproduce this is to run the entire code in a Google Colab (use a GPU runtime).
Issue Analytics
- State:
- Created 3 years ago
- Comments:8 (3 by maintainers)
Top GitHub Comments
There will be some loss in accuracy, but that is of course dependent on the dataset. This is probably a good mechanism for speeding up cross validation though! You can wrap your validation loop with this context manager:
1e-3 is the default value. Changing it to 1e-2 might speed things up.
Not necessarily. GPyTorch sets the tolerance of its iterative methods to be much tighter during prediction than during training. So it is likely that we’re running more CG iterations during the prediction loop.
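To illustrate why a tighter tolerance means more work, here is a minimal sketch of plain (unpreconditioned) conjugate gradients in NumPy that reports how many iterations it needs for a given relative residual tolerance. It is not GPyTorch's solver, and the kernel, problem size, and the two tolerance values are assumptions chosen only for illustration.

```python
import numpy as np


def conjugate_gradients(A, b, tol, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Stops once ||r|| <= tol * ||b|| and returns (x, iterations used).
    """
    x = np.zeros_like(b)
    r = b.copy()              # residual b - A x for the initial guess x = 0
    p = r.copy()              # first search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            return x, k
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x, max_iter


rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists) + 0.1 * np.eye(300)   # RBF kernel plus noise (SPD)
y = rng.standard_normal(300)

_, iters_loose = conjugate_gradients(K, y, tol=1e-1)   # "training-like" tolerance
_, iters_tight = conjugate_gradients(K, y, tol=1e-4)   # "prediction-like" tolerance
print(f"loose: {iters_loose} iterations, tight: {iters_tight} iterations")
```

Each iteration costs one kernel matrix-vector product, which is the expensive KeOps reduction in the issue's setup, so extra iterations at prediction time translate directly into extra wall-clock time.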
Yes. This is because we make caches of the large predictive computations that make subsequent computations much faster. The first time the prediction code is called, the cache is created. However, if the hyperparameters change (i.e. after a training iteration) then the cache is discarded because it has to be recomputed.
See https://arxiv.org/pdf/1903.08114.pdf - Section 3 paragraph “Predictions”
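The caching behaviour described in the comments can be mimicked with a toy example. The sketch below is not GPyTorch's implementation: `ToyCachedGP` is a hypothetical class that stores the expensive solve against the training data, reuses it across prediction calls, and discards it whenever a hyperparameter changes.

```python
import numpy as np


class ToyCachedGP:
    """Toy GP mean predictor that caches the training solve (illustration only)."""

    def __init__(self, X, y, noise=0.1):
        self.X, self.y, self.noise = X, y, noise
        self.lengthscale = 1.0
        self._cache = None      # holds (K + noise * I)^{-1} y once computed
        self.solves = 0         # counts how often the expensive solve runs

    def _kernel(self, A, B):
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * sq_dists / self.lengthscale ** 2)

    def set_lengthscale(self, value):
        self.lengthscale = value
        self._cache = None      # hyperparameters changed, so the cache is stale

    def predict_mean(self, X_new):
        if self._cache is None:
            K = self._kernel(self.X, self.X) + self.noise * np.eye(len(self.X))
            self._cache = np.linalg.solve(K, self.y)
            self.solves += 1
        return self._kernel(X_new, self.X) @ self._cache


rng = np.random.default_rng(0)
X_train = rng.standard_normal((50, 3))
y_train = (X_train ** 2).sum(-1)
gp = ToyCachedGP(X_train, y_train)

for _ in range(3):                          # three prediction batches, fixed hyperparameters
    gp.predict_mean(rng.standard_normal((10, 3)))
solves_before = gp.solves                   # only the first batch paid for the solve

gp.set_lengthscale(0.5)                     # stands in for an optimizer step
gp.predict_mean(rng.standard_normal((10, 3)))
solves_after = gp.solves                    # cache was discarded, so one more solve
```

This matches the pattern reported in the issue: the first prediction batch after each training step is slow, and later batches with unchanged hyperparameters are fast.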