Hello, just another question regarding `mll` and early stopping.
If I am using an approximate GP with mini-batch training, and I am supposed to use the mll in training mode for early stopping, how do I combine the mll values of the different batches? Do I just take a plain sum of the mll from each mini-batch?
import gpytorch
from gpytorch.models import ApproximateGP
from gpytorch.variational import CholeskyVariationalDistribution, VariationalStrategy

class GPModel(ApproximateGP):
    def __init__(self, inducing_points):
        # Variational posterior over the inducing values; inducing locations are learned
        variational_distribution = CholeskyVariationalDistribution(inducing_points.size(0))
        variational_strategy = VariationalStrategy(self, inducing_points, variational_distribution, learn_inducing_locations=True)
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(lengthscale_constraint=gpytorch.constraints.GreaterThan(1))
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
# mll is assumed to be a variational objective, e.g.
# mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=len(train_dataset))
model.train()
likelihood.train()
running_loss = 0.0
for x_batch, y_batch in minibatch_iter:
    optimizer.zero_grad()
    output = model(x_batch)
    loss = -mll(output, y_batch.t())
    minibatch_iter.set_postfix(loss=loss.item())
    loss.backward()
    optimizer.step()
    running_loss += loss.item() * x_batch.size(0)  # weight by batch size (last batch may be smaller)
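Concretely, would the right epoch-level number just be the size-weighted average of the per-batch values, as in this rough sketch (epoch_loss and train_dataset are placeholder names; if mll is a VariationalELBO, each batch loss is already roughly an average per data point)?
# Sketch: one comparable number per epoch from the accumulated per-batch losses
# (running_loss summed loss.item() * batch size in the loop above).
epoch_loss = running_loss / len(train_dataset)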
Or rather, do I just set the model to training mode and compute the loss on the entire training dataset:
model.train()
likelihood.train()
train_output = model(train_x)
train_loss = -mll(train_output, train_y).item()
and use train_loss for early stopping?
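Either way, I assume the early-stopping check itself would just be a patience-style comparison on whichever number is tracked per epoch, something like this sketch (best_loss, min_delta, patience and current_loss are placeholder names; current_loss stands for epoch_loss or train_loss above):
# Sketch of a patience-based early-stopping check, run once per epoch.
if current_loss < best_loss - min_delta:
    best_loss = current_loss            # improvement: remember it and reset the counter
    epochs_without_improvement = 0
else:
    epochs_without_improvement += 1     # no improvement this epoch
    if epochs_without_improvement >= patience:
        stop_training = True            # signal the outer epoch loop to stop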
_Originally posted by @ginward in https://github.com/cornellius-gp/gpytorch/issues/1201#issuecomment-764944490_
Top GitHub Comments
@ginward sorry for the slow response. Use train() mode. Briefly:
- train() mode is for when you want to compute the marginal log likelihood (e.g. the prior predictive). This is one way to measure model fit.
- eval() mode is for when you want to compute the posterior predictive.
As mentioned in #1445 - the marginal log likelihood already penalizes model complexity through the log determinant factor (see Rasmussen and Williams, Chapter 5).
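For example, on a held-out set the two quantities could be computed roughly as in the sketch below (val_x, val_y and the mll object are assumed to come from the training setup above):
# Sketch: marginal log likelihood in train() mode (a model-fit score) ...
model.train()
likelihood.train()
with torch.no_grad():
    val_fit = mll(model(val_x), val_y)       # higher is better; one possible early-stopping signal

# ... versus the posterior predictive in eval() mode.
model.eval()
likelihood.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    pred = likelihood(model(val_x))          # posterior predictive distribution
    val_pred_ll = pred.log_prob(val_y)       # predictive log-likelihood of val_y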
Of course, it is possible that the learned noise and scale parameters aren’t well fit to the data. You can add priors to these parameters (see https://docs.gpytorch.ai/en/latest/examples/00_Basic_Usage/Hyperparameters.html), which is a form of regularization.
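Following that tutorial, adding priors could look roughly like this (a sketch; the Gamma shape/rate values are illustrative, not a recommendation):
# Sketch: priors on the noise, lengthscale, and outputscale act as regularizers
# on the hyperparameters learned during training.
likelihood = gpytorch.likelihoods.GaussianLikelihood(
    noise_prior=gpytorch.priors.GammaPrior(1.1, 0.05),
)
covar_module = gpytorch.kernels.ScaleKernel(
    gpytorch.kernels.RBFKernel(lengthscale_prior=gpytorch.priors.GammaPrior(3.0, 6.0)),
    outputscale_prior=gpytorch.priors.GammaPrior(2.0, 0.15),
)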
In the future, this type of discussion would be better suited as a Discussion topic. I’m closing this issue for now - and please open a discussion topic if you have any more questions.
Thanks for your response.
When I calculate the mll on the validation set, do I set the model to train mode or eval mode? From what I learned from #1201, it seems that setting it to eval mode will only return the posterior, but what we would like to maximize is the prior, right?
Therefore, should I do the following on the validation set:
Or rather, should I simply do the following:
Thanks @jacobrgardner