
[feature request] GPLVMs with missing data

See original GitHub issue

I’m interested in using the pyro.contrib.gp.models.GPLVM model in the case where some points in Y (the observed data) are missing at random.

GPy has an example here.

This functionality doesn’t seem to be available in Pyro at the moment (at least, I couldn’t find any examples of it). Is this something that could be added? I’d be happy to contribute some code to do this.
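For context, here is a minimal sketch of the standard GPLVM setup in pyro.contrib.gp without any missing-data handling, loosely following the Pyro GPLVM tutorial; the shapes, inducing points and hyperparameters below are only illustrative:

import torch
import pyro.contrib.gp as gp

N, D, latent_dim = 100, 5, 2
y = torch.randn(D, N)                      # observed data, one row per output dimension
X_init = torch.zeros(N, latent_dim)        # initial guess / prior mean for the latent inputs

kernel = gp.kernels.RBF(input_dim=latent_dim, lengthscale=torch.ones(latent_dim))
Xu = torch.randn(20, latent_dim)           # inducing inputs for the sparse approximation
base = gp.models.SparseGPRegression(X_init, y, kernel, Xu,
                                    noise=torch.tensor(0.01), jitter=1e-5)
gplvm = gp.models.GPLVM(base)              # wraps the base GP and treats X as a latent variable
losses = gp.util.train(gplvm, num_steps=2000)

The request here is to make this kind of setup work when some entries of y are NaN (missing at random).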

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
InfProbSciX commented, Oct 7, 2020

This worked! This is the class for reference:

import torch
import pyro
import pyro.contrib.gp as gp
import pyro.distributions as dist

class MaskedGaussian(gp.likelihoods.Gaussian):
    def forward(self, f_loc, f_var, y=None):
        y_var = f_var + self.variance
        y_dist = dist.Normal(f_loc, y_var.sqrt())

        if y is not None:
            if y.isnan().any():
                # Mask out the missing entries and replace the NaNs with a finite
                # dummy value so that no NaN leaks into the backward pass.
                y_dist = dist.MaskedDistribution(y_dist, ~y.isnan())
                y = torch.masked_fill(y, y.isnan(), -999.)

            y_dist = y_dist.expand_by(y.shape[:-f_loc.dim()]).to_event(y.dim())
        return pyro.sample(self._pyro_get_fullname("y"), y_dist, obs=y)

I think I’m actually slightly in favour of changing the existing Gaussian likelihood class instead; I could add an optional argument to its __init__, something like mask_missing_data=False, since there’s a lot of repeated code here. What would you say?
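A rough sketch of what that proposal could look like, purely hypothetical: neither the mask_missing_data flag nor this variant exists in pyro.contrib.gp.likelihoods, and it is written as a subclass here only to keep the sketch self-contained; the body just shows where the flag would gate the masking inside the existing Gaussian forward:

import torch
import pyro
import pyro.contrib.gp as gp
import pyro.distributions as dist

class Gaussian(gp.likelihoods.Gaussian):
    def __init__(self, variance=None, mask_missing_data=False):
        super().__init__(variance)
        self.mask_missing_data = mask_missing_data

    def forward(self, f_loc, f_var, y=None):
        y_var = f_var + self.variance
        y_dist = dist.Normal(f_loc, y_var.sqrt())
        if y is not None:
            if self.mask_missing_data and y.isnan().any():
                # Only mask and fill when the user opts in via the hypothetical flag.
                y_dist = dist.MaskedDistribution(y_dist, ~y.isnan())
                y = torch.masked_fill(y, y.isnan(), -999.)
            y_dist = y_dist.expand_by(y.shape[:-f_loc.dim()]).to_event(y.dim())
        return pyro.sample(self._pyro_get_fullname("y"), y_dist, obs=y)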

Also, another quick comment, would it be possible to use this likelihood with the (Sparse)GPRegression classes? I ask because likelihood doesn’t seem to be an optional argument for those classes, so perhaps it’d be a bit more work to get them to work under the missing data regime.
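On that point, a hedged sketch rather than a confirmed answer: the variational GP classes in pyro.contrib.gp (VariationalGP and VariationalSparseGP) do accept an explicit likelihood argument, so the MaskedGaussian class defined above should plug into those directly; whether (Sparse)GPRegression can be adapted the same way is exactly the open question here. Shapes and inducing points below are illustrative.

import torch
import pyro.contrib.gp as gp

# assumes the MaskedGaussian class from the comment above is in scope
N, D, M = 100, 5, 20
X = torch.randn(N, 2)                      # inputs (or the latent X of a GPLVM)
y = torch.randn(D, N)
y[0, :10] = float("nan")                   # some observations are missing

kernel = gp.kernels.RBF(input_dim=2)
Xu = X[:M].clone()                         # inducing inputs
vsgp = gp.models.VariationalSparseGP(X, y, kernel, Xu=Xu,
                                     likelihood=MaskedGaussian())
losses = gp.util.train(vsgp, num_steps=1000)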

1 reaction
fehiepsi commented, Oct 7, 2020

@InfProbSciX I think you will also need to do y = torch.masked_fill(y, y.isnan(), -999.) if y has NaN values. This note explains well why NaN appears during the backward pass.
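A small illustration of that point, assuming nothing beyond torch and pyro.distributions: even though MaskedDistribution zeroes out the masked log-probabilities in the forward pass, the chain rule still multiplies a zero upstream gradient by the NaN local gradient of log_prob at a NaN observation, and 0 * nan is nan, so the parameter gradients come back as NaN unless the NaNs are first replaced with a finite dummy value:

import torch
import pyro.distributions as dist

y = torch.tensor([0.5, float("nan")])
mask = ~y.isnan()

# Masking alone: the forward loss is finite, but the gradient is poisoned.
loc = torch.zeros(2, requires_grad=True)
dist.MaskedDistribution(dist.Normal(loc, 1.0), mask).log_prob(y).sum().backward()
print(loc.grad)        # the masked position still comes back as nan

# Masking plus masked_fill: the gradient stays finite, and the dummy value
# never contributes to the log-likelihood because it is masked out.
loc = torch.zeros(2, requires_grad=True)
y_filled = torch.masked_fill(y, y.isnan(), -999.)
dist.MaskedDistribution(dist.Normal(loc, 1.0), mask).log_prob(y_filled).sum().backward()
print(loc.grad)        # finite everywhere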

By the way, the MaskedGaussian interface looks great to me. 👍

Read more comments on GitHub >

Top Results From Across the Web

  • Missing data imputation and heart failure readmission prediction
    "The GPLVM-based missing data imputation can provide both the mean ... This model can provide a mean estimate of each missing feature along ..."
  • Gaussian Process Latent Variable Flows for ... - OpenReview
    "We compare this framework with traditional models like the Bayesian GPLVM. Our experiments focus on massively missing data settings."
  • Missing data imputation and heart failure readmission prediction
    "... perform missing data imputation using GPLVM. This model can provide a mean estimate of each missing feature along with the variance ..."
  • arXiv:2202.12979v1 [cs.LG] 25 Feb 2022
    "Generalised Gaussian Process Latent Variable Models (GPLVM) with ... the presence of massively missing data and ob..."
  • Applications and Extensions of GPLVM
    "Application: Deal with missing data in mocap. [K. Grochow, S. Martin, A. Hertzmann ...] Is training with so little data a bug or ..."
