[Question] Implementing Sparse GPs for Batch Independent MultiOutputs
Question
Not a bug, sorry for the incorrect label; a docs/examples label would probably have been a better fit.
I am trying to implement a sparse GP that takes a multidimensional input and produces an independent multidimensional output. For the regular exact GP case, I understand that adding batch_shape=torch.Size([dim_output]) to the mean and kernel modules (along with num_tasks=dim_output in the likelihood) does the trick; a minimal sketch of that working setup follows.
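Here is a minimal sketch of that exact-GP setup, following the batch-independent multitask pattern from the GPyTorch docs (the class name and the dim_output value are illustrative):

import torch
import gpytorch

dim_output = 6  # number of independent output dimensions (tasks)

class BatchIndependentExactGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        # One mean/kernel per task via the leading batch dimension
        self.mean_module = gpytorch.means.ConstantMean(batch_shape=torch.Size([dim_output]))
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(batch_shape=torch.Size([dim_output])),
            batch_shape=torch.Size([dim_output]),
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        # Batch MVN -> multitask MVN with independent tasks
        return gpytorch.distributions.MultitaskMultivariateNormal.from_batch_mvn(
            gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
        )

likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=dim_output)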
However, for a sparse GP with an InducingPointKernel, when I add batch_shape=torch.Size([dim_output]) I get the following error:
Traceback (most recent call last):
  File "multioutput-sgp-example.py", line 57, in <module>
    model = BatchIndependentMultitaskSGPModel(train_x, train_y, likelihood)
  File "multioutput-sgp-example.py", line 41, in __init__
    self.covar_module = gpytorch.kernels.InducingPointKernel(
TypeError: __init__() got an unexpected keyword argument 'batch_shape'
If I instead omit batch_shape from the InducingPointKernel, I get the following:
Traceback (most recent call last):
  File "multioutput-sgp-example.py", line 82, in <module>
    loss = -mll(output, train_y)
  File "/usr/lib/python3.8/site-packages/gpytorch/module.py", line 24, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/usr/lib/python3.8/site-packages/gpytorch/mlls/exact_marginal_log_likelihood.py", line 55, in forward
    res = res.add(added_loss_term.loss(*params))
  File "/usr/lib/python3.8/site-packages/gpytorch/mlls/inducing_point_kernel_added_loss_term.py", line 18, in loss
    return 0.5 * (diag / noise_diag).sum()
RuntimeError: The size of tensor a (100) must match the size of tensor b (36) at non-singleton dimension 1
100 is my sample size and 6 is my output dimension.
To reproduce
Code snippet to reproduce
import math
import torch
import gpytorch

train_x = torch.randn(100, 9)
train_y = torch.stack([
    torch.sin(train_x[:, 0] * (2 * math.pi)) + torch.randn(100) * 0.2,
    torch.cos(train_x[:, 1] * (2 * math.pi)) + torch.randn(100) * 0.2,
    torch.sin(train_x[:, 2] * (2 * math.pi)) + torch.randn(100) * 0.2,
    torch.cos(train_x[:, 3] * (2 * math.pi)) + torch.randn(100) * 0.2,
    torch.sin(train_x[:, 4] * (2 * math.pi)) + torch.randn(100) * 0.2,
    torch.cos(train_x[:, 5] * (2 * math.pi)) + torch.randn(100) * 0.2,
], -1)

# test_x = torch.linspace(0, 1, 51)
test_x = torch.randn(51, 9)  # input dimension fixed to 9 to match train_x (was 6)

# Load data onto GPU
if torch.cuda.is_available():
    train_x = train_x.cuda()
    train_y = train_y.cuda()
    test_x = test_x.cuda()

class BatchIndependentMultitaskSGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        # INPUT: num_inducing_points
        num_inducing_points = 30
        self.mean_module = gpytorch.means.ConstantMean(batch_shape=torch.Size([6]))
        self.base_covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(batch_shape=torch.Size([6])),
            batch_shape=torch.Size([6])
        )
        self.covar_module = gpytorch.kernels.InducingPointKernel(
            self.base_covar_module,
            inducing_points=train_x[:num_inducing_points],
            # batch_shape=torch.Size([6]),  # uncommenting this raises the TypeError above
            likelihood=likelihood
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal.from_batch_mvn(
            gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
        )

likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=6)
model = BatchIndependentMultitaskSGPModel(train_x, train_y, likelihood)

# Load model onto GPU
if torch.cuda.is_available():
    model = model.cuda()
    likelihood = likelihood.cuda()

# Find optimal model hyperparameters
model.train()
likelihood.train()

# Use the Adam optimizer
optimizer = torch.optim.Adam([
    {'params': model.parameters()},  # Includes GaussianLikelihood parameters
], lr=0.1)

# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

training_iterations = 50
for i in range(training_iterations):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    print('Iter %d/%d - Loss: %.3f' % (i + 1, training_iterations, loss.item()))
    optimizer.step()
Stack trace/error message: see the error messages above.
Expected Behavior
Ideally, I should be able to use sparse GPs in a multi-input/multi-output fashion. Adding an InducingPointKernel to an existing multi-input/multi-output exact GP (thereby turning it into an SGP model) should let the model train and predict properly.
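Something like the following sketch is what I would expect to write, assuming a GPyTorch version in which InducingPointKernel accepts batched inducing points; the (num_tasks, num_inducing, input_dim) shape here is an assumption, not a confirmed API for 1.0.1:

import torch
import gpytorch

num_tasks, num_inducing, input_dim = 6, 30, 9

base_covar_module = gpytorch.kernels.ScaleKernel(
    gpytorch.kernels.RBFKernel(batch_shape=torch.Size([num_tasks])),
    batch_shape=torch.Size([num_tasks]),
)
likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=num_tasks)

# Assumption: one set of inducing points per task, so the batch dimension of the
# inducing points matches the batch_shape of the base kernel. Not verified against
# GPyTorch 1.0.1 -- see the maintainer comments below.
inducing_points = torch.randn(num_tasks, num_inducing, input_dim)
covar_module = gpytorch.kernels.InducingPointKernel(
    base_covar_module,
    inducing_points=inducing_points,
    likelihood=likelihood,
)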
System information
Please complete the following information:
- GPyTorch Version: 1.0.1
- PyTorch Version: 1.4.0
- OS: Arch Linux
Top GitHub Comments
Yeah, I’ll take a look soon. Sorry about that; February has been really crazy for us, with ICML recently and UAI next week, so we’ve built up quite a backlog of things to look at over the past month…
@jacobrgardner is this now fixed to work with the batch independent multitask GP model, or do sparse models still not work with batch-mode GPs?
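For completeness, the variational route also gives batch-independent multi-output sparse GPs; a minimal sketch, assuming a recent GPyTorch version that provides IndependentMultitaskVariationalStrategy (the class name and sizes here are illustrative, following the GPyTorch SVGP multitask docs):

import torch
import gpytorch

num_tasks, num_inducing, input_dim = 6, 30, 9

class IndependentMultitaskSVGP(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        # inducing_points: (num_tasks, num_inducing, input_dim), one set per task
        variational_distribution = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(-2), batch_shape=torch.Size([num_tasks])
        )
        variational_strategy = gpytorch.variational.IndependentMultitaskVariationalStrategy(
            gpytorch.variational.VariationalStrategy(
                self, inducing_points, variational_distribution,
                learn_inducing_locations=True,
            ),
            num_tasks=num_tasks,
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean(batch_shape=torch.Size([num_tasks]))
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(batch_shape=torch.Size([num_tasks])),
            batch_shape=torch.Size([num_tasks]),
        )

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

model = IndependentMultitaskSVGP(torch.randn(num_tasks, num_inducing, input_dim))
likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=num_tasks)
mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=100)

The task-level batch dimension plays the same role as batch_shape in the exact-GP model above; VariationalELBO replaces ExactMarginalLogLikelihood in the training loop.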