Is there a way to set the uncertainty of the observations for conditioning?
Howdy folks,
Say we trained a GP on data with some known standard deviation `d1`. Then we deploy this GP in the world, but with sensors that might be noisier than the one from which the training data was obtained: `d2 > d1`.
I am using `model.set_train_data(obs_x, obs_y)` to condition on the (noisier) observations, and then I get my predictive distribution as follows:
```python
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    test_x = torch.linspace(1000, 0, 100)
    predictions = likelihood(model(test_x))
```
Where do I need to account for the noise of the observations (`d2`)? Do I set the likelihood to `d2`?
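For concreteness, here is a minimal sketch of one way the known noise could be supplied, using `gpytorch.likelihoods.FixedNoiseGaussianLikelihood` so each observation's noise variance is given explicitly rather than learned (the model class, the data, and the value of `d2` are illustrative stand-ins):

```python
import math

import torch
import gpytorch


class ExactGPModel(gpytorch.models.ExactGP):
    """A plain exact GP, as in the GPyTorch regression tutorial."""

    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )


d2 = 0.2  # known std of the deployed, noisier sensor (illustrative value)
obs_x = torch.linspace(0, 1, 50)
obs_y = torch.sin(2 * math.pi * obs_x) + d2 * torch.randn(50)

# FixedNoiseGaussianLikelihood takes per-observation *variances*, so the
# conditioning uses Kxx + diag(noise) instead of a learned sigma^2 * I.
likelihood = gpytorch.likelihoods.FixedNoiseGaussianLikelihood(
    noise=torch.full_like(obs_y, d2**2)
)
model = ExactGPModel(obs_x, obs_y, likelihood)

model.eval()
likelihood.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    test_x = torch.linspace(0, 1, 100)
    # The test-time noise is passed explicitly as well:
    predictions = likelihood(model(test_x), noise=torch.full_like(test_x, d2**2))
```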
Thanks in advance
Galto
Top GitHub Comments
Hi @jacobrgardner
I think I am missing something or misunderstood your explanation. The way I am interpreting your explanation is that the measurement noise is added after the prediction (i.e. conditioning) has been done. But I always understood that conditioning (with noise) is done by replacing `Kxx` with `Kxx + sigma^2*I` in the equations for the predictive mean and predictive covariance. So somehow the measurement noise has to get “inserted” before doing the prediction, no?

Your explanation about organizing the data made perfect sense to me, and so if I wanted to do multi-sensor fusion/prediction for n sensors, each with its own variance (`sigma_1^2, sigma_2^2, ..., sigma_n^2`), then I would need to do something like this. I am not understanding how adding `DiagLazyTensor([d1, d1, ..., d1, d2, d2, ..., d2])` to the prediction is equivalent.

On top of all that, I want to do this with multi-task GP regression, since each sensor is measuring various properties (all at the same locations, but with different sensors and thus different noises), like temperature, pressure, % water vapour, etc.
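A short sketch of that noise term (the variances and readings below are made up): the per-sensor noise variances can be concatenated into one vector and handed to `FixedNoiseGaussianLikelihood`, which places them on the diagonal of the training covariance:

```python
import torch
import gpytorch

# Illustrative per-sensor noise variances and readings.
d1_sq, d2_sq = 0.01, 0.04
y1 = torch.randn(30)  # observations from sensor 1
y2 = torch.randn(20)  # observations from sensor 2

obs_y = torch.cat([y1, y2])
# The heteroskedastic analogue of DiagLazyTensor([d1, ..., d1, d2, ..., d2]):
obs_noise = torch.cat([
    torch.full_like(y1, d1_sq),
    torch.full_like(y2, d2_sq),
])

# Conditioning then uses Kxx + diag(obs_noise) rather than Kxx + sigma^2 * I.
likelihood = gpytorch.likelihoods.FixedNoiseGaussianLikelihood(noise=obs_noise)
```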
I am still confused about this 😃
So I have used some data `(X_train, Y_train)` for solving the hyperparameters (i.e. for training). This gives me the hyperparameters for my kernel function `K(.,.)`.

Now I want to ‘deploy’ kernel `K(.,.)` in a real-life application, and use it on new and unseen data `(X1, Y1)` to make predictions `Y*` over new locations `X*`. This new data comes from sensors other than the one that produced the training data, and so will have “unseen noise” `VAR_Y1`.
.So, all in all, I want to implement something like the following:
```
gp_prior.load_kernel('my_kernel.pt')
gp_posterior = gp_prior.condition_on(X1, Y1, Var_Y1)
Y* = gp_posterior.make_prediction(X*)
```
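In GPyTorch terms, that pseudocode might look roughly like the sketch below (assuming `ExactGPModel` is the usual `gpytorch.models.ExactGP` subclass as sketched earlier in the thread, that `my_kernel.pt` holds a `state_dict` saved after training, and that `X1`, `Y1`, `Var_Y1`, and `X_star` are stand-ins for the deployment data):

```python
import torch
import gpytorch

# Stand-ins for the deployment data described above.
X1 = torch.linspace(0, 1, 40)
Y1 = torch.randn(40)
Var_Y1 = torch.full((40,), 0.04)    # known per-observation noise variances
X_star = torch.linspace(0, 1, 100)  # the new locations X*

# "condition_on(X1, Y1, Var_Y1)": build the model around the new data and its
# known noise, then reuse the hyperparameters saved after training.
likelihood = gpytorch.likelihoods.FixedNoiseGaussianLikelihood(noise=Var_Y1)
gp_posterior = ExactGPModel(X1, Y1, likelihood)
gp_posterior.load_state_dict(torch.load('my_kernel.pt'), strict=False)

gp_posterior.eval()
likelihood.eval()
with torch.no_grad():
    # "make_prediction(X*)": latent posterior plus the known test-time noise.
    Y_star = likelihood(gp_posterior(X_star), noise=torch.full_like(X_star, 0.04))
```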
Given the equations in Rasmussen’s book (http://www.gaussianprocess.org/gpml/chapters/RW.pdf), the uncertainties from the online observations need to be included in the computation of the predictive distribution (see equations 2.20 through 2.24, “Predictions using Noisy Observations”; a transcription follows below).
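The standard form of those predictive equations, transcribed here for reference (`\sigma_n^2` is the observation noise variance):

```latex
% Predictive mean and covariance with noisy observations (GPML, Sec. 2.2).
% The noise enters through K(X,X) + \sigma_n^2 I inside the inverse,
% i.e. before the conditioning, as argued in the comment above.
\begin{align}
  \bar{\mathbf{f}}_* &= K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1}\mathbf{y} \\
  \operatorname{cov}(\mathbf{f}_*) &= K(X_*, X_*)
    - K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} K(X, X_*)
\end{align}
```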
It’s currently not obvious to me how adding these noises after having computed the predictive distribution, by applying a likelihood to it as in `likelihood(model(test_x))`, is equivalent to the operation described in equations 2.20 through 2.24 in Rasmussen’s text (or the equations in the previous post, in the case of conditioning on multiple observations from different sensors with different noises).
I haven’t really dug deep into the GPyTorch code, so I assume that somehow the noise is “extracted” from the likelihood that was passed as an argument when instantiating an ExactGP object, and then applied in the predictive distribution computation?
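A small sketch to probe that assumption (it presumes `model` and `likelihood` are the trained `ExactGP` and `GaussianLikelihood` from the question, both in eval mode): in eval mode `model(test_x)` should already condition on the training data with the noise folded into the train-train covariance, so applying the likelihood afterwards should only add noise to the diagonal of the test covariance:

```python
import torch

# Assumes `model` (an ExactGP) and `likelihood` (a GaussianLikelihood) from
# the question above, both already switched to eval mode.
with torch.no_grad():
    test_x = torch.linspace(0, 1, 5)
    latent = model(test_x)         # p(f* | X, y): train noise already inside Kxx + sigma^2*I
    observed = likelihood(latent)  # p(y* | X, y): noise added to the test covariance
    diff = observed.covariance_matrix - latent.covariance_matrix
    print(torch.allclose(diff, likelihood.noise * torch.eye(5)))  # expected: True
```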