Null hypothesis of Kolmogorov Smirnov test is not correctly described


The statement

Under the null hypothesis, the two distributions are identical, F(x)=G(x). The alternative hypothesis can be either ‘two-sided’ (default), ‘less’ or ‘greater’.

in the documentation of stats.kstest (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kstest.html) is correct only when alternative == 'two-sided'. The one-sided alternatives instead test the null F >= G against the alternative F < G, and vice versa.

Same problem for stats.ks_2samp (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ks_2samp.html#scipy.stats.ks_2samp):

This is a two-sided test for the null hypothesis that 2 independent samples are drawn from the same continuous distribution.

and stats.ks_1samp (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ks_1samp.html):

Under the null hypothesis, the two distributions are identical, F(x)=G(x).

The alternatives are correctly described in the R documentation: https://stat.ethz.ch/R-manual/R-patched/library/stats/html/ks.test.html
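To make the distinction concrete, here is a minimal sketch (not taken from the issue) that runs stats.kstest with each value of alternative on the same data. The sample, its size, the seed, and the 0.5 shift are illustrative assumptions. The sample is drawn from N(0.5, 1), so its true CDF F lies below the standard normal CDF G everywhere, and only the 'less' and 'two-sided' tests should reject.

```python
import numpy as np
from scipy import stats

# Illustrative data (an assumption, not from the issue): N(0.5, 1) sample,
# so the true CDF F(x) is below the standard normal CDF G(x) for every x.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.5, scale=1.0, size=200)

# Run the one-sample KS test against the standard normal under each alternative.
results = {
    alt: stats.kstest(sample, "norm", alternative=alt)
    for alt in ("two-sided", "less", "greater")
}

for alt, res in results.items():
    print(f"{alt:>9}: statistic={res.statistic:.3f}, pvalue={res.pvalue:.3g}")
```

If the null hypothesis were F(x) = G(x) for every alternative, the large p-value returned for 'greater' would be hard to interpret, since F and G clearly differ here. Reading the 'greater' null as F <= G resolves this.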

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 12 (12 by maintainers)

Top GitHub Comments

mdhaber commented, Jul 20, 2020 (1 reaction)

This acknowledges the fact that sometimes the null hypothesis is written differently depending on the alternative, but that writing the null hypothesis the same way in all cases is also acceptable.

However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

I favor the way it is currently written, as I think the distribution we use to determine the p-value should be derived from the null hypothesis (precisely as it is stated). For the KS test, does the distribution used to calculate p-values depend on the alternative hypothesis being tested? If not, I think that the way the null hypothesis is written is not incorrect regardless of the alternative.
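One way to probe this question empirically is the sketch below. It rests on an assumption about SciPy's current behavior that the thread does not confirm: that for small samples with the default settings, the one-sided p-values come from stats.ksone and the two-sided p-value from stats.kstwo, both being distributions of the statistic when F = G exactly. The sample and its size are illustrative.

```python
import numpy as np
from scipy import stats

# Illustrative sample (an assumption): 50 draws from the standard normal.
rng = np.random.default_rng(1)
n = 50
sample = rng.normal(size=n)

res_two = stats.ks_1samp(sample, stats.norm.cdf, alternative="two-sided")
res_less = stats.ks_1samp(sample, stats.norm.cdf, alternative="less")
res_greater = stats.ks_1samp(sample, stats.norm.cdf, alternative="greater")

# Assumed reference distributions: ksone for the one-sided statistics,
# kstwo for the two-sided statistic, each evaluated at sample size n.
p_less_ref = stats.ksone.sf(res_less.statistic, n)
p_greater_ref = stats.ksone.sf(res_greater.statistic, n)
p_two_ref = stats.kstwo.sf(res_two.statistic, n)

print("less:     ", res_less.pvalue, p_less_ref)
print("greater:  ", res_greater.pvalue, p_greater_ref)
print("two-sided:", res_two.pvalue, p_two_ref)
```

If the printed pairs agree, the p-values in all three cases are computed under exact equality F = G, with only the statistic (and hence its reference distribution) changing with the alternative.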

josef-pkt commented, Nov 27, 2020 (0 reactions)

Sounds fine to me, then, to use the last version with a weak inequality in the null.

I’m sticking to equality null in statsmodels, because I don’t want to get into composite nulls when we don’t need or use it.


