
Hello, just another question regarding `mll` and early stopping.



If I am using an approximate GP with mini-batch training, and I am supposed to use the `mll` in training mode for early stopping, how do I combine the `mll` values from the different batches? Do I just sum the `mll` of each mini-batch?

import torch
import gpytorch
from gpytorch.models import ApproximateGP
from gpytorch.variational import CholeskyVariationalDistribution, VariationalStrategy

class GPModel(ApproximateGP):
    def __init__(self, inducing_points):
        variational_distribution = CholeskyVariationalDistribution(inducing_points.size(0))
        variational_strategy = VariationalStrategy(self, inducing_points, variational_distribution, learn_inducing_locations=True)
        super(GPModel, self).__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(lengthscale_constraint=gpytorch.constraints.GreaterThan(1))
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

# Training loop (outside the class body; minibatch_iter is a tqdm-wrapped DataLoader)
model.train()
likelihood.train()
for x_batch, y_batch in minibatch_iter:
    optimizer.zero_grad()
    output = model(x_batch)
    loss = -mll(output, y_batch.t())
    minibatch_iter.set_postfix(loss=loss.item())
    loss.backward()
    optimizer.step()
    running_loss += loss.item() * x_batch.size(0)  # accumulate, weighted by batch size

Or rather, do I just set the model to training mode and compute the loss on the entire training dataset:

        model.train()
        likelihood.train()
        train_output = model(train_x)
        train_loss = -mll(train_output, train_y).item()  # pass the model output, not train_x

and use train_loss for early stopping?

_Originally posted by @ginward in https://github.com/cornellius-gp/gpytorch/issues/1201#issuecomment-764944490_

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
gpleiss commented, Feb 18, 2021

@ginward sorry for the slow response

When I calculate the mll on the validation set, do I set the model to training mode or eval mode? From what I learned from #1201, it seems that setting it to eval mode will only return the posterior, but what we would like to maximize is the prior, right?

train() mode. Briefly:

  • train() mode is for when you want to compute the marginal log likelihood (e.g. the prior predictive). This is one way to measure model fit.
  • eval() mode is for when you want to compute the posterior predictive.
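
Concretely, on a held-out set the two quantities can be computed like this (a minimal sketch, assuming `model`, `likelihood`, `mll`, `val_x`, and `val_y` from the snippets above; for the approximate GP here, `mll` would typically be a `gpytorch.mlls.VariationalELBO`, so the train()-mode value is the ELBO rather than the exact marginal log likelihood):

    # train() mode: (approximate) marginal log likelihood / ELBO
    model.train()
    likelihood.train()
    with torch.no_grad():
        val_mll = mll(model(val_x), val_y).item()

    # eval() mode: posterior predictive log likelihood
    model.eval()
    likelihood.eval()
    with torch.no_grad():
        val_pred_ll = likelihood.log_marginal(val_y, model(val_x)).mean().item()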

I wonder what the recommended method is w.r.t. regularization and preventing overfitting for simple GP regression models? I would like some early stopping condition to avoid having to tune the number of training steps.

As for regularisation, I found that adjusting the noise and scale parameters, as well as the lengthscale parameters, works quite well.

As mentioned in #1445 - the marginal log likelihood already penalizes model complexity through the log determinant factor (see Rasmussen and Williams, Chapter 5).
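
Concretely, for exact GP regression the marginal log likelihood decomposes as (Rasmussen and Williams, Eq. 5.8):

    log p(y | X) = -1/2 y^T (K + sigma_n^2 I)^{-1} y - 1/2 log |K + sigma_n^2 I| - (n/2) log(2 pi)

where the first term measures data fit and the log-determinant term automatically penalizes overly complex (e.g. very short lengthscale) models.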

Of course, it is possible that the learned noise and scale parameters aren’t well fit to the data. You can add priors to these parameters (see https://docs.gpytorch.ai/en/latest/examples/00_Basic_Usage/Hyperparameters.html), which is a form of regularization.
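
For example, priors can be attached directly in the constructors (a sketch; the Gamma prior parameters here are illustrative, not recommendations):

    import gpytorch

    likelihood = gpytorch.likelihoods.GaussianLikelihood(
        noise_prior=gpytorch.priors.GammaPrior(1.1, 0.05),
    )
    covar_module = gpytorch.kernels.ScaleKernel(
        gpytorch.kernels.RBFKernel(
            lengthscale_prior=gpytorch.priors.GammaPrior(3.0, 6.0),
        ),
        outputscale_prior=gpytorch.priors.GammaPrior(2.0, 0.15),
    )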

In the future, this type of discussion would be better suited as a Discussion topic. I’m closing this issue for now - and please open a discussion topic if you have any more questions.

1 reaction
ginward commented, Jan 22, 2021

If you want to do early stopping based on, say, a validation-set NLL, what you would normally do is, every k epochs, process your validation set in batches and compute a standard test NLL on it. Then, at the end of the full training procedure, you can choose the model with the best validation NLL.

I doubt that doing early stopping on the training loss would be very useful: the training loss is not something you'd expect to start trending upwards. It's not exactly monotonic, but that's mostly due to fluctuations during training, and it should generally decrease over time.
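
In loop form, that recipe might look like the following sketch (assuming `model`, `likelihood`, `mll`, `optimizer`, `num_epochs`, a `train_loader`, and a `val_loader` are already defined; `eval_every` and the checkpointing are illustrative):

    import copy
    import torch

    best_val_nll, best_state = float("inf"), None
    eval_every = 5  # check the validation NLL every k epochs

    for epoch in range(num_epochs):
        model.train()
        likelihood.train()
        for x_batch, y_batch in train_loader:
            optimizer.zero_grad()
            loss = -mll(model(x_batch), y_batch)
            loss.backward()
            optimizer.step()

        if epoch % eval_every == 0:
            model.eval()
            likelihood.eval()
            total_nll, n = 0.0, 0
            with torch.no_grad():
                for x_val, y_val in val_loader:
                    total_nll += -likelihood.log_marginal(y_val, model(x_val)).sum().item()
                    n += y_val.numel()
            val_nll = total_nll / n
            if val_nll < best_val_nll:  # keep the best checkpoint seen so far
                best_val_nll, best_state = val_nll, copy.deepcopy(model.state_dict())

    # After training, restore the best checkpoint
    if best_state is not None:
        model.load_state_dict(best_state)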

Thanks for your response.

When I calculate the mll on the validation set, do I set the model to training mode or eval mode? From what I learned from #1201, it seems that setting it to eval mode will only return the posterior, but what we would like to maximize is the prior, right?

Therefore, should I do the following on the validation set:

        model.train()
        likelihood.train()
        val_output = model(val_x)
        val_loss = -mll(val_output, val_y).item()  # pass the model output, not val_x

Or rather, should I simply do the following:

        model.eval()
        likelihood.eval()
        with torch.no_grad():
            nll = -likelihood.log_marginal(val_y, model(val_x)).mean()

Thanks @jacobrgardner
