
Rescale predictive_gradients when normalizer=True


Hello,

I was expecting model.predictive_gradients to return the gradients of model.predict, but it looks like it actually returns the gradients of model._raw_predict. When normalizer=True, the user has to correct the results of model.predictive_gradients manually. (See code below.)

This is problematic when model.predictive_gradients is used in an acquisition function in the context of Bayesian Optimization or Experimental Design, as in GPyOpt or Emukit.

The easy solution is to edit the docstring of model.predictive_gradients to make clear that normalization is not included. A more durable solution is to rescale the output of model.predictive_gradients, as is already done for model.predict and model.predict_quantiles; a sketch of such a rescaling wrapper follows the reproduction script below. (I suspect that model.predict_jacobian has the same problem, but I haven’t checked.)

If I am missing anything, please let me know.

Thanks, Antoine

import numpy as np
import GPy

np.random.seed(2)

X = np.random.rand(15,3)
Y = np.random.rand(15,1)

normalize_Y = True
ker = GPy.kern.RBF(input_dim=X.shape[1])
model = GPy.models.GPRegression(X=X, Y=Y, kernel=ker,
                                normalizer=normalize_Y)

x = np.array([[-0.3, 0.1, 0.5]])
mu, var = model.predict(x)                       # predictions in the original (unnormalized) Y scale
mu_jac, var_jac = model.predictive_gradients(x)  # gradients, not rescaled when normalizer=True

# Finite-difference approximation
eps = 1e-8
mu_jac_num = np.zeros(x.shape[1])
var_jac_num = np.zeros(x.shape[1])
for ii in range(x.shape[1]):
    x_eps = x + eps*np.eye(1, x.shape[1], ii)
    mu_eps, var_eps = model.predict(x_eps)
    mu_jac_num[ii] = (mu_eps-mu)/eps
    var_jac_num[ii] = (var_eps-var)/eps

print('MU')
print(mu_jac.ravel())      # as returned by model.predictive_gradients
print(mu_jac_num.ravel())  # finite differences of model.predict
if normalize_Y:
    # manual rescaling needed for the two to match
    print(mu_jac.ravel() * model.normalizer.std)

print('VAR')
print(var_jac.ravel())      # as returned by model.predictive_gradients
print(var_jac_num.ravel())  # finite differences of model.predict
if normalize_Y:
    # manual rescaling needed for the two to match
    print(model.normalizer.inverse_variance(var_jac).ravel())
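
In the meantime, a rescaling wrapper along these lines could serve as a workaround. This is only a sketch: the helper name rescaled_predictive_gradients is made up here, and it assumes that model.normalizer is None when normalization is off and otherwise exposes the .std and .inverse_variance members used above.

def rescaled_predictive_gradients(model, x):
    """Gradients of model.predict(x), i.e. in the original (unnormalized) Y scale."""
    dmu_dx, dvar_dx = model.predictive_gradients(x)
    if model.normalizer is not None:
        dmu_dx = dmu_dx * model.normalizer.std                # mean gradient scales with std
        dvar_dx = model.normalizer.inverse_variance(dvar_dx)  # variance gradient scales with std**2
    return dmu_dx, dvar_dx

With the model and x from the script above, the output of rescaled_predictive_gradients(model, x) should match the finite-difference estimates mu_jac_num and var_jac_num.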

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 5

Top GitHub Comments

2 reactions
ablancha commented, Jan 13, 2020

Not that I know of. I’ll see if I can make a pull request in the next few days.

Along the same lines, I think most people would expect that evaluating posterior_covariance_between_points between a point x and itself should return the same thing as model.predict(x)[1]. Currently this is not the case because posterior_covariance_between_points does not account for likelihood noise or normalization.
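
For illustration, a minimal snippet that exhibits the mismatch might look like the following (random data chosen just for the example; behaviour as of the version discussed here):

import numpy as np
import GPy

np.random.seed(0)
X = np.random.rand(15, 3)
Y = np.random.rand(15, 1)
model = GPy.models.GPRegression(X=X, Y=Y, normalizer=True)

x = np.random.rand(1, 3)
# One might expect these two values to agree, but the second call neither adds
# the likelihood noise nor undoes the normalization.
print(model.predict(x)[1])                              # predictive variance
print(model.posterior_covariance_between_points(x, x))  # posterior covariance of the latent function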

0 reactions
ablancha commented, Apr 17, 2020

Amir, the posterior mean and variance are fine when normalizer=True. The only problem was with the gradients of the posterior mean and variance when normalizer=True, in that the gradients did not match those computed by finite differences of model.predict. Before the fix, the following snippet would raise an error, which it shouldn’t. (Again, the culprit in the below is model.predictive_gradients, not model.predict.)

Let me know if this helps.

import numpy as np
import GPy
from GPy.models import GradientChecker

N, M, Q = 10, 15, 3
X = np.random.rand(M,Q)
Y = np.random.rand(M,1)
x = np.random.rand(N, Q)
model = GPy.models.GPRegression(X=X, Y=Y, normalizer=True)
# Compare the analytical gradients against finite differences of model.predict
gm = GradientChecker(lambda x: model.predict(x)[0],
                     lambda x: model.predictive_gradients(x)[0],
                     x, 'x')
gc = GradientChecker(lambda x: model.predict(x)[1],
                     lambda x: model.predictive_gradients(x)[1],
                     x, 'x')
assert gm.checkgrad()
assert gc.checkgrad()