[Bug] High GPU memory consumption of `ProductKernel` compared to its equivalent variant

🐛 Bug

I want to apply different kernels to different input dimensions for a specific problem, but I run out of GPU memory when using GPyTorch. Narrowing the problem down to its simplest version:

  • A product of several RBFKernel instances, one per dimension, uses more GPU memory than a single RBFKernel that covers the same dimensions through the active_dims argument.
  • The difference in memory consumption grows as the number of dimensions increases.
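
The two constructions are expected to be equivalent because a product of one-dimensional RBF kernels with separate lengthscales equals a single RBF kernel with one lengthscale per dimension. The following is a minimal pure-Python sketch of that identity. It does not use GPyTorch, and the function names and sample values are illustrative assumptions.

```python
import math


def rbf_1d(x, y, lengthscale):
    """RBF kernel on a single scalar dimension."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)


def rbf_product(x, y, lengthscales):
    """Product of one-dimensional RBF kernels, one per dimension."""
    value = 1.0
    for xi, yi, li in zip(x, y, lengthscales):
        value *= rbf_1d(xi, yi, li)
    return value


def rbf_ard(x, y, lengthscales):
    """Single RBF kernel with one lengthscale per dimension."""
    sq_dist = sum(((xi - yi) / li) ** 2 for xi, yi, li in zip(x, y, lengthscales))
    return math.exp(-0.5 * sq_dist)


# Arbitrary sample points and lengthscales, chosen only for illustration.
x = [0.1, 0.5, 0.9]
y = [0.4, 0.2, 0.7]
lengthscales = [0.3, 1.0, 2.5]

# The two forms agree up to floating-point rounding.
print(math.isclose(rbf_product(x, y, lengthscales), rbf_ard(x, y, lengthscales)))
```

The identity holds because the exponential of a sum is the product of the exponentials, so any memory difference comes from how the kernels are evaluated, not from what they compute.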

To reproduce

import torch
import gpytorch


class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, kernel):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = kernel

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)


likelihood = gpytorch.likelihoods.GaussianLikelihood().cuda()

n_observations = 10000
dims = 3

train_x = torch.rand(n_observations, dims).cuda()
train_y = (
    torch.sin(train_x[:, 0] * 5)
    + torch.sin(train_x[:, 1] * 10)
    + torch.sin(train_x[:, 2] * 15)
).cuda()

### Version 1
# K = None
# for i in range(dims):
#     if K is None:
#         K = gpytorch.kernels.RBFKernel(ard_num_dims=1, active_dims=[i])
#     else:
#         K = K * gpytorch.kernels.RBFKernel(ard_num_dims=1, active_dims=[i])

### Version 2
K = gpytorch.kernels.RBFKernel(ard_num_dims=dims, active_dims=[0, 1, 2])

model = ExactGPModel(
    train_x, train_y, likelihood, kernel=gpytorch.kernels.ScaleKernel(K)
).cuda()
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

model.train()
likelihood.train()

optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

training_iter = 100
for i in range(training_iter):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    print(
        i + 1,
        training_iter,
        loss.item(),
        model.covar_module.base_kernel.lengthscale,
        model.likelihood.noise.item(),
    )
    optimizer.step()

Expected Behavior

The Version 1 and Version 2 kernels are equivalent, so both should consume the same amount of GPU memory. Currently, Version 1 consumes 5069 MB and Version 2 consumes 3161 MB.
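
To put the reported figures in perspective, the sketch below compares the gap with the size of one dense 10000 x 10000 matrix. It assumes float32 storage (the default dtype of torch.rand) and that the reported numbers are in MiB. This is arithmetic only and does not show where the extra memory is actually allocated.

```python
n_observations = 10_000
bytes_per_float32 = 4  # assumption: float32, the default dtype of torch.rand

# Size of one dense n x n matrix in MiB.
matrix_mib = n_observations * n_observations * bytes_per_float32 / 2**20

# Memory usage reported in the issue (assumed to be MiB).
version_1_mb = 5069
version_2_mb = 3161
gap_mb = version_1_mb - version_2_mb

print(f"one dense matrix: {matrix_mib:.1f} MiB")
print(f"gap: {gap_mb} MB, about {gap_mb / matrix_mib:.1f} dense matrices")
```

Under these assumptions the gap is on the order of a handful of dense n x n matrices, which would be consistent with extra intermediate matrices being kept for the product, although the issue itself does not confirm the cause.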

System information

  • GPyTorch 1.5.1
  • PyTorch 1.9.0+cu102
  • Ubuntu 18.04

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
gpleiss commented, Nov 8, 2021

Are x1 and x2 integer category values? You can use IndexKernel, setting the covar_factor parameter to zero and the raw_var parameter to 1.

https://docs.gpytorch.ai/en/stable/kernels.html#indexkernel

The IndexKernel is very memory efficient, and is designed to be used as part of a ProductKernel.
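
The GPyTorch documentation describes the IndexKernel covariance as a lookup table of the form B Bᵀ + diag(v), where B corresponds to covar_factor and v is the variance derived from raw_var. The pure-Python sketch below shows what that table looks like when B is zero. The helper name is hypothetical, and the variance values of 1.0 are illustrative, since the value GPyTorch derives from raw_var depends on the parameter's constraint.

```python
def index_covariance(covar_factor, var):
    """Return B @ B.T + diag(var) as nested lists (hypothetical helper)."""
    n = len(var)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            # Low-rank part: dot product of rows i and j of B.
            low_rank = sum(bi * bj for bi, bj in zip(covar_factor[i], covar_factor[j]))
            # The diagonal part only contributes when i == j.
            row.append(low_rank + (var[i] if i == j else 0.0))
        matrix.append(row)
    return matrix


num_categories = 3
rank = 1
covar_factor = [[0.0] * rank for _ in range(num_categories)]  # B set to zero
var = [1.0] * num_categories  # illustrative variances, not GPyTorch output

lookup = index_covariance(covar_factor, var)
for row in lookup:
    print(row)
```

With B set to zero the table is diagonal, so two inputs have nonzero covariance only when they fall in the same category, and the table has one entry per pair of categories instead of one per pair of observations.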

0 reactions
gpleiss commented, Nov 22, 2021

Sure!
