
[Question] Conditional model output when using SGD

See original GitHub issue

I am trying to use a multitask Gaussian process for Bayesian optimization on a dataset of about 10,000 records. To speed up training, I am using mini-batch SGD to fit the hyperparameters of the model's kernel and mean: on each step I randomly sample a batch from my training data, set it as the model's training data, compute the marginal log-likelihood, and step the optimizer.

Before moving the model into eval(), I load the full dataset into the model with set_train_data(). However, when I make predictions, the posterior mean does not pass through the training examples. I would also expect the confidence interval to narrow as the inputs approach the training points, but it does not.

My guess is that this is related to how MultitaskGaussianLikelihood works. Is that the case? Since I am using SGD, I would like the conditioned model's accuracy on the training data to be independent of how well the mini-batch optimization happened to go. Is this possible?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
wjmaddox commented, Sep 15, 2021

Yes, what's likely happening is that there is a local "optimum" of the hyperparameters which has [short lengthscale, large noise] for the five-data-point case (an "underfit" GP), which is why you are severely under-fitting. In that regime, conditioning on the observed data does very little, and the GP's posterior stays close to the prior.

Here, I profiled the 5-data-point regime to visualize the landscape; note the large trough of low MLL for a noise around 0.5:

[figure: MLL surface over the 5-point regime; values truncated at 3 b/c they skyrocket past there]

To prevent this from occurring, beyond what I suggested previously, you could also place a prior or constraint on the likelihood noise before optimizing (which changes the optimization landscape) to keep the noise from growing too large.

0 reactions
fleskovar commented, Sep 15, 2021

Thank you so much for the explanation; it really helped me understand how things work under the hood. I have just run some tests with priors and constraints on the noise parameter, and I am now able to get the results I was expecting.

Additionally, thank you for the link to the paper about SGD and the recommendation on the KroneckerMultiTaskGP (I might come back with additional questions about this later!).


