
WIP: Addition of MLE for stats.invgauss/wald


Part of issue #11782.

@mdhaber,

I have been investigating the maximum likelihood estimators given in Chapter 25, Inverse Gaussian (Wald) Distribution, of the textbook we have been using, but the results from those MLEs don’t come close to the parameters of either the wald or invgauss distribution in scipy.

Here is how the PDF is described in the textbook, on page 120 (screenshot omitted; the standard inverse Gaussian density is f(x; μ, λ) = √(λ / (2πx³)) · exp(−λ(x − μ)² / (2μ²x))).

And here are the equations for the MLE (screenshot omitted; the standard estimators are μ̂ = x̄ and 1/λ̂ = (1/n) Σ (1/xᵢ − 1/μ̂)).

It may be another issue with terminology: what the textbook calls the location parameter mu is implemented as a shape parameter in stats.invgauss. Moreover, the mean of a random sample only matches the mu used in generation when loc and scale are left at their defaults, loc=0 and scale=1.

For example,

from scipy.stats import invgauss
data = invgauss.rvs(mu=3.25, size=10000)
print(data.mean())

=> 3.223328.... Just one example, but it appears that in general it matches.

With other location and scale values it doesn’t match.

from scipy.stats import invgauss
data = invgauss.rvs(mu=3.25, scale=2, size=10000)
print(data.mean())

=> 6.62447.... Just one example, but in general they don’t match.

The code for the scale (the textbook’s λ̂) is

import numpy as np
scale = len(data) / np.sum(1/data - 1/mu)

and I played around with both of these equations quite a bit to try to get them to match up with, or do better than, the default fit method. Other than the relation I described above, I’m not sure these will work for these distributions. It may be possible to use the mean to determine the shape parameter in invgauss if the user has already fixed floc=0, scale=1, but that seems like a very niche usage.
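As a sketch of the standard loc=0, scale=1 case (variable names are illustrative; this assumes the textbook estimators μ̂ = x̄ and 1/λ̂ = (1/n) Σ (1/xᵢ − 1/μ̂)):

```python
import numpy as np
from scipy.stats import invgauss

rng = np.random.default_rng(0)
# Default loc=0, scale=1: here SciPy's shape parameter coincides with the
# textbook's mean parameter mu.
data = invgauss.rvs(3.25, size=200_000, random_state=rng)

mu_hat = data.mean()                             # textbook: mu-hat = sample mean
lam_hat = len(data) / np.sum(1/data - 1/mu_hat)  # textbook: lambda-hat
print(mu_hat, lam_hat)  # mu_hat near 3.25, lam_hat near 1
```

With the defaults, the book’s λ comes out near 1, consistent with it playing the role of the scale.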

What are your thoughts?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
mdhaber commented, Jul 7, 2020

I think if you go a bit further, you’ll find that with loc=0 they are equivalent: the book’s μ corresponds to SciPy’s mu · scale, and the book’s λ corresponds to SciPy’s scale.

So if the user passes in floc=0, you could use the equations from the book to get the book’s version of the parameters, then use the relationships above to get SciPy’s parameters.

Update: Yes:

import numpy as np
from scipy.stats import invgauss
data = invgauss.rvs(mu=3.25, scale=2, size=1000000)

mu = np.mean(data)                               # book's mu-hat: the sample mean
s = len(data) / (np.sum(data**(-1) - mu**(-1)))  # book's lambda-hat, i.e. SciPy's scale
mu_s = mu/s                                      # SciPy's shape parameter mu
print(mu_s, s)

gives

3.2698217368812785 1.9933346841496646

If the user doesn’t pass in floc=0, then you could at least use the analytical solution above (assuming floc=0) as a guess to the super fit method. You might try following @WarrenWeckesser’s argument here about weibull_min to see if it applies to this distribution.
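A minimal sketch of that idea (illustrative only, not SciPy’s actual implementation): compute the analytical loc=0 solution, convert it to SciPy’s parameterization, and pass it as the initial guess to the generic numerical fit:

```python
import numpy as np
from scipy.stats import invgauss

rng = np.random.default_rng(1)
data = invgauss.rvs(3.25, scale=2, size=100_000, random_state=rng)

# Analytical MLE in the book's parameterization, assuming loc=0.
m = data.mean()
lam = len(data) / np.sum(1/data - 1/m)

# Convert to SciPy's parameterization (mu = m/lam, scale = lam) and use it
# as the starting guess for the generic numerical fit over all parameters.
mu_fit, loc_fit, scale_fit = invgauss.fit(data, m/lam, loc=0, scale=lam)
print(mu_fit, loc_fit, scale_fit)
```

Starting the optimizer near the analytical solution should make the generic fit both faster and more reliable than the default starting point.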

0 reactions
swallan commented, Jul 7, 2020

@mdhaber @WarrenWeckesser

Thanks for the pointers! I really appreciate it. I’ll create a PR for this soon.
