[Question] eval_cg_tolerance dramatic effect on variance
Hello 👋
Changing `gpytorch.settings.eval_cg_tolerance()` during evaluation produces a dramatic change in the predicted variance of a multitask GP with a Gaussian likelihood. I changed only the default tolerance from 1E-2 to 1E-6 and obtained the following posterior variance plots.
Is this expected behavior?
Reproduce the issue
The model
```python
import torch
import gpytorch
from gpytorch.models import ExactGP
from gpytorch.kernels import RBFKernel, IndexKernel
from gpytorch.priors import LKJCovariancePrior, SmoothedBoxPrior
from gpytorch.distributions import MultivariateNormal
from gpytorch.constraints import GreaterThan  # only needed if the constraint below is enabled


class MultitaskGPModel(ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(MultitaskGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = LinearMean(input_size=4)
        self.covar_module = RBFKernel(
            # lengthscale_constraint=GreaterThan(torch.Tensor([0.2]))
        )
        self.task_covar_module = IndexKernel(
            num_tasks=2,
            rank=2,
            prior=LKJCovariancePrior(
                2, eta=0.3, sd_prior=SmoothedBoxPrior(0, 1), validate_args=False
            ),
        )

    def forward(self, x, i):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        covar_i = self.task_covar_module(i)
        # Hadamard product of the input covariance and the task covariance
        covar = covar_x.mul(covar_i)
        return MultivariateNormal(mean_x, covar)


class LinearMean(gpytorch.means.Mean):
    """Linear mean function: x @ weights (+ bias)."""

    def __init__(self, input_size, batch_shape=torch.Size(), bias=True):
        super().__init__()
        self.register_parameter(
            name="weights",
            parameter=torch.nn.Parameter(torch.ones(*batch_shape, input_size, 1)),
        )
        if bias:
            self.register_parameter(
                name="bias", parameter=torch.nn.Parameter(torch.ones(*batch_shape, 1))
            )
        else:
            self.bias = None

    def forward(self, x):
        res = x.matmul(self.weights).squeeze(-1)
        if self.bias is not None:
            res = res + self.bias
        return res
```
The code
The model is pre-trained with a noise covariance of 0.1 from the GaussianLikelihood; the data and model state dict are attached.
```python
import matplotlib.pyplot as plt
from gpytorch.likelihoods import GaussianLikelihood

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # assumed; not shown in the original snippet
dtype = torch.float64

# loadTrainData and "state_dict.pt" refer to the attached data/model files
likelihood = GaussianLikelihood()
full_train_x, full_train_i, full_train_y = loadTrainData(stage)
model = MultitaskGPModel((full_train_x, full_train_i), full_train_y, likelihood)
state_dict = torch.load("state_dict.pt")
model.load_state_dict(state_dict)

likelihood = likelihood.to(device, dtype)
likelihood.eval()
model = model.to(device, dtype)
model.eval()

# Test grid. Note: as written X is 1-D, but LinearMean above has
# input_size=4 and the plotting below indexes X[:, 0], so the full
# 4-dimensional grid construction presumably lives in the attached data.
X = torch.arange(500, 5000 + 1, 100).to(device, dtype)
task_vec = torch.ones(X.shape[0]).to(device, dtype)

with torch.no_grad(), gpytorch.settings.eval_cg_tolerance(1E-6):
    posterior = likelihood(model(X, task_vec))
mean = posterior.mean.detach()
variance = posterior.variance.detach()
std = variance.sqrt()

plt.plot(X[:, 0], mean, color='black', linestyle='dashed')
plt.fill_between(X[:, 0], mean, mean + std, alpha=0.4, color='blue')
plt.fill_between(X[:, 0], mean, mean - std, alpha=0.4, color='blue')
# same fill_between for more variance bands
```
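To isolate the solver's contribution, one option (my addition, not from the original issue) is to compare both tolerances against an exact Cholesky baseline on the same test points: `gpytorch.settings.max_cholesky_size` forces the dense path, and toggling `train()`/`eval()` between runs discards GPyTorch's cached prediction strategy so each setting actually takes effect.

```python
# Diagnostic sketch (illustrative): compare the predictive variance at both
# CG tolerances against an exact Cholesky solve.
def predict_variance(tol):
    model.train()
    model.eval()  # toggling clears the cached prediction strategy
    with torch.no_grad(), gpytorch.settings.eval_cg_tolerance(tol):
        return likelihood(model(X, task_vec)).variance

var_loose = predict_variance(1e-2)  # default tolerance
var_tight = predict_variance(1e-6)  # tight tolerance

# Exact baseline: force a dense Cholesky solve instead of CG
model.train()
model.eval()
with torch.no_grad(), gpytorch.settings.max_cholesky_size(int(1e6)):
    var_exact = likelihood(model(X, task_vec)).variance

print("max |1e-2 - exact|:", (var_loose - var_exact).abs().max().item())
print("max |1e-6 - exact|:", (var_tight - var_exact).abs().max().item())
```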
Expected behavior
I expected that evaluating with the default CG tolerance of 1E-2 would give results consistent with the learned white-noise hyperparameter of 0.1.
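A quick way to test that expectation: since the posterior above comes from `likelihood(model(...))`, its variance includes the observation noise and should never fall below the learned noise level; values below it point to solver error.

```python
# Sanity check: the likelihood-level predictive variance should be bounded
# below by the learned observation noise (~0.1 here), up to solver error.
with torch.no_grad():
    print("learned noise:", likelihood.noise.item())
    print("min predictive variance:", posterior.variance.min().item())
```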
Thank you!
Top GitHub Comments
Sorry, I forgot to mention that training and evaluation are done with the data re-scaled to the unit hypercube, not z-scored. Will try z-scoring. Thanks!
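For reference, z-scoring would look something like the sketch below (variable names reuse the snippet above and are otherwise illustrative); the key point is to compute the statistics on the training set and reuse them at evaluation time.

```python
# Illustrative z-scoring sketch: compute statistics on the training set
# only and reuse the same statistics for the test inputs.
x_mean, x_std = full_train_x.mean(dim=0), full_train_x.std(dim=0)
y_mean, y_std = full_train_y.mean(), full_train_y.std()

train_x_z = (full_train_x - x_mean) / x_std
train_y_z = (full_train_y - y_mean) / y_std
test_x_z = (X - x_mean) / x_std  # same statistics; no test-set leakage
```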
@irinaespejo can you please post a fully runnable code example, i.e. something that I can copy-paste into a script and reproduce the results that you see?
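For reference, such a script might look like the following sketch, with synthetic data standing in for the attached files (the data, shapes, and noise value below are illustrative, and the `MultitaskGPModel`/`LinearMean` definitions are reused from above).

```python
# Illustrative stand-alone sketch: synthetic data replaces the attached
# files; MultitaskGPModel and LinearMean are the classes defined above.
import torch
import gpytorch

torch.manual_seed(0)

# Synthetic 4-dimensional inputs and integer task indices for IndexKernel
train_x = torch.rand(100, 4, dtype=torch.float64)
train_i = torch.randint(0, 2, (100, 1))
train_y = torch.sin(3 * train_x.sum(-1)) + 0.1 * torch.randn(100, dtype=torch.float64)

likelihood = gpytorch.likelihoods.GaussianLikelihood().double()
model = MultitaskGPModel((train_x, train_i), train_y, likelihood).double()
likelihood.noise = 0.1  # stand-in for the pre-trained noise hyperparameter

model.eval()
likelihood.eval()

test_x = torch.rand(200, 4, dtype=torch.float64)
test_i = torch.zeros(200, 1, dtype=torch.long)

for tol in (1e-2, 1e-6):
    model.train()
    model.eval()  # toggling clears cached solves so the new tolerance takes effect
    # max_cholesky_size(0) forces the CG path even for this small problem
    with torch.no_grad(), gpytorch.settings.max_cholesky_size(0), \
            gpytorch.settings.eval_cg_tolerance(tol):
        var = likelihood(model(test_x, test_i)).variance
    print(f"tol={tol:g}: mean predictive variance = {var.mean().item():.6f}")
```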