FittingWithOutlierRemoval doesn't work for model sets

Using FittingWithOutlierRemoval in place of a plain LinearLSQFitter instance to fit a model set raises "ValueError: Input argument u'x' does not have the correct dimensions in model_set_axis=0 for a model set with n_models=4176". This blocks the very common use case of fitting each row of an image with rejection, as in IRAF's fit1d. One could, of course, fit each row with a separate model instance, but that would be much slower, and even with model sets, fitting already seems unpleasantly slow compared with IRAF, so it probably wouldn't be viable.
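For reference, the row-by-row fallback mentioned above can be sketched in plain NumPy. This is only an illustration of the fit1d-style use case, not astropy code: the helper name `fit_row_with_rejection`, the synthetic image, and the simple 3-sigma clipping rule are all assumptions made for the example.

```python
import numpy as np
from numpy.polynomial.polynomial import polyfit, polyval


def fit_row_with_rejection(x, y, degree=1, nsigma=3.0, niter=3):
    """Fit one row with a polynomial, iteratively rejecting outliers.

    Hypothetical helper for illustration; returns (coefficients, good-point mask).
    """
    good = np.ones(y.shape, dtype=bool)
    for _ in range(niter):
        coeffs = polyfit(x[good], y[good], degree)
        resid = y - polyval(x, coeffs)
        sigma = resid[good].std()
        # Points rejected earlier stay rejected; clip the rest at nsigma.
        new_good = good & (np.abs(resid) <= nsigma * sigma)
        if new_good.sum() == good.sum():
            break
        good = new_good
    # Final fit using only the points that survived rejection.
    coeffs = polyfit(x[good], y[good], degree)
    return coeffs, good


# Synthetic "image": 4 rows following y = 2 + 0.5 x, plus two bad pixels.
rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
image = 2.0 + 0.5 * x + rng.normal(0.0, 0.1, size=(4, x.size))
image[1, 10] += 50.0
image[3, 40] -= 50.0

# One independent fit per row: simple, but the Python-level loop is the slow part.
results = [fit_row_with_rejection(x, row) for row in image]
coeffs = np.array([c for c, _ in results])
masks = np.array([m for _, m in results])
```

Each row ends up with its own mask of good points, which is exactly what makes the single-call model-set version awkward.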

Unfortunately, I think fixing this might be non-trivial: Within FittingWithOutlierRemoval, L557 (also L567-568) selects only good points from an input array of x co-ordinates, like so:

https://github.com/astropy/astropy/blob/667b7ab621dfa80290fbe36cd711ad0a427ca272/astropy/modeling/fitting.py#L556-L559

Obviously the rejected points will differ from one row/model to another, which means that FittingWithOutlierRemoval cannot simply pass the underlying Fitter a single array of x co-ordinates (in N-1 dimensions) to use for every model. Unlike model evaluation, I think fitting of model sets in general only works with a common x array for all the models, so LinearLSQFitter and friends may need substantial changes for this to work. Moreover, I believe the matrix equation solved by the underlying np.linalg routine assumes a 2D Vandermonde matrix, corresponding to a single set of points in x. I'm not even sure (off the top of my head) that the problem can be expressed in a form that the back-end solver can work with, which would be rather unfortunate.
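To make the shared-matrix point concrete, here is a small NumPy sketch. It is not the astropy implementation; the degree-1 polynomial, the array names, and the hand-picked rejected points are assumptions for illustration. With a common x, one 2D Vandermonde matrix serves every row and a single `np.linalg.lstsq` call fits them all; once each row has its own mask, each row needs its own matrix.

```python
import numpy as np
from numpy.polynomial.polynomial import polyvander

x = np.linspace(-1.0, 1.0, 6)
true_coeffs = np.array([[1.0, 2.0], [3.0, -1.0], [0.5, 0.0]])  # (n_models, 2)

V = polyvander(x, 1)      # shape (6, 2): columns are 1 and x
Y = true_coeffs @ V.T     # shape (3, 6): one noise-free row per model

# Common x: a single solve of V @ C = Y.T fits all rows at once.
C, *_ = np.linalg.lstsq(V, Y.T, rcond=None)
fitted = C.T              # shape (3, 2), one coefficient pair per row

# Per-row rejection: different rows keep different points, so the
# per-row Vandermonde matrices no longer share a shape, let alone values.
good = np.ones(Y.shape, dtype=bool)
good[0, 1] = False
good[2, 4] = False
per_row_shapes = [polyvander(x[m], 1).shape for m in good]
```

Here `per_row_shapes` comes out ragged, so the per-row matrices cannot be passed to the solver as the single 2D array it expects.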

Would you mind confirming that this is a correct analysis, @nden (assuming you know off hand)? I’ve always had a bit of a hard time discerning from the documentation exactly what dimensionality of inputs the models and fitters accept under different circumstances.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 17 (17 by maintainers)

Top GitHub Comments

1 reaction
jehturner commented, Nov 23, 2017

Not in the linear fitter. It solves a matrix equation where one of the terms is a Vandermonde matrix corresponding to a common set of points for all the models. If someone can point out an option I’m overlooking, great, but I don’t think so… (note that np.linalg complains if you try to give it a higher-dimensional matrix).
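The restriction is easy to reproduce in plain NumPy. The sketch below first hands `np.linalg.lstsq` a stack of per-row design matrices, then tries one possible workaround that the thread does not propose: building weighted normal equations per row and solving them in one batched `np.linalg.solve` call. The array names and the 0/1 weights are assumptions for illustration, and normal equations are less well conditioned than a least-squares solve, so treat this as an experiment rather than a recommended fix.

```python
import numpy as np
from numpy.polynomial.polynomial import polyvander

x = np.linspace(-1.0, 1.0, 8)
V = polyvander(x, 1)                                # (8, 2), common to all rows
true_coeffs = np.array([[1.0, 2.0], [3.0, -1.0]])
Y = true_coeffs @ V.T                               # (2, 8)
Y[0, 3] += 100.0                                    # one bad point in row 0
good = np.ones(Y.shape, dtype=bool)
good[0, 3] = False                                  # reject it in that row only

# One masked design matrix per row gives a 3-D stack, which lstsq rejects.
stacked = V[np.newaxis, :, :] * good[:, :, np.newaxis]   # (2, 8, 2)
try:
    np.linalg.lstsq(stacked, Y.T, rcond=None)
    lstsq_accepts_3d = True
except (np.linalg.LinAlgError, ValueError):
    lstsq_accepts_3d = False

# Batched weighted normal equations: (V^T W_n V) c_n = V^T W_n y_n,
# with W_n the diagonal 0/1 weights for row n.
W = good.astype(float)                              # (n_rows, n_points)
A = np.einsum('np,pi,pj->nij', W, V, V)             # (n_rows, 2, 2)
b = np.einsum('np,pi,np->ni', W, V, Y)              # (n_rows, 2)
coeffs = np.linalg.solve(A, b[..., np.newaxis])[..., 0]
```

Because the data are noise-free apart from the masked point, `coeffs` should reproduce `true_coeffs` if the batched formulation is doing what is claimed.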

1 reaction
jehturner commented, Nov 17, 2017

I had just started having a go at this when my disk failed 🙁.
