BUG: scipy.stats.multiscale_graphcorr p-values are computed differently from literature and other packages

Describe your issue.

This bug is as described here: https://github.com/neurodata/hyppo/issues/124. The issue is reproduced below:

The p-values do not appear to be computed correctly. A review of the literature and of similar independence-testing code suggests that most implementations follow the approach of Phipson et al., 2011 (described in 6.2), which always includes the given ordering as one of the permutations. To my knowledge, this should be updated for all permutation-based approaches.
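As a rough illustration of what "including the given ordering as a permutation" means, here is a minimal sketch. It assumes a simple stand-in statistic (absolute Pearson correlation) rather than MGC, and the helper name `perm_pvalue` is hypothetical, not part of SciPy or hyppo. The observed ordering adds one to both the count of exceedances and the number of permutations.

```python
import numpy as np


def perm_pvalue(x, y, reps=100, seed=0):
    """Permutation p-value that counts the observed ordering as a permutation.

    Hypothetical helper for illustration only; it uses |Pearson r| as a
    stand-in test statistic, not MGC.
    """
    rng = np.random.default_rng(seed)
    # Statistic on the data as given.
    stat = abs(np.corrcoef(x, y)[0, 1])
    # Statistic on `reps` random re-orderings of y.
    null_dist = np.array(
        [abs(np.corrcoef(x, rng.permutation(y))[0, 1]) for _ in range(reps)]
    )
    exceed = int((null_dist >= stat).sum())
    # The observed ordering contributes the "+ 1" in numerator and denominator.
    return stat, (exceed + 1) / (reps + 1)


x = np.arange(25.0)
stat, pval = perm_pvalue(x, x, reps=100)
print(stat, pval)
```

With this form the p-value can never be exactly zero, so no special case is needed.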

Reproducing Code Example

from scipy.stats import multiscale_graphcorr
import numpy as np

X = np.arange(0, 25)
Y = np.arange(0, 25)
stat, pval, _ = multiscale_graphcorr(X, Y, reps=100)
print(pval)
print(stat)
>> .01
>> 1

p-value obtained is 1/100.

R example using the energy package (which contains other nonparametric multivariate independence tests based on permutation tests), with energy version 1.7-7 and R version 4.0.2:

require(energy)
X = 0:25; Y = 0:25; result = dcor.test(X, Y, R=100)
print(result$p.value)
>> .0099...
print(result$stat)
>> 1

p-value obtained is 1/(100 + 1). We should always use the result from Phipson et al., 2011, rather than only in the case where the p-value would otherwise be 0.

Error message

There is no error message for this issue. The proposed change is very small. It would require changing this line (https://github.com/neurodata/hyppo/pull/223):


# calculate p-value and significant permutation map through list
pvalue = (null_dist >= stat).sum() / reps

# correct for a p-value of 0. This is because, with bootstrapping
# permutations, a p-value of 0 is incorrect
if pvalue == 0:
    pvalue = 1 / reps

to:

pvalue = ((null_dist >= stat).sum() + 1) / (1 + reps)
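To see that the two formulas differ for every outcome, not only when no permuted statistic reaches the observed one, the following sketch applies both to a synthetic null distribution. The function names `pvalue_current` and `pvalue_proposed` and the integer-valued `null_dist` are assumptions made for illustration; they mirror the two snippets above but are not taken from the library source.

```python
import numpy as np


def pvalue_current(null_dist, stat):
    """Behaviour quoted above: divide by reps and patch only an exact zero."""
    reps = len(null_dist)
    pvalue = (null_dist >= stat).sum() / reps
    if pvalue == 0:
        pvalue = 1 / reps
    return pvalue


def pvalue_proposed(null_dist, stat):
    """Proposed change: always count the observed ordering as a permutation."""
    reps = len(null_dist)
    return ((null_dist >= stat).sum() + 1) / (1 + reps)


# Synthetic null distribution (assumption): the integers 0..99, so reps = 100.
null_dist = np.arange(100)

for stat in (100, 95, 50):
    print(stat, pvalue_current(null_dist, stat), pvalue_proposed(null_dist, stat))
```

For `stat = 100` no permuted value reaches the observed statistic, so the current code returns 1/100 while the proposed formula returns 1/101, matching the Python and R results reported above. For the other two values the results are 5/100 versus 6/101 and 50/100 versus 51/101.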

SciPy/NumPy/Python version information

1.7.0 1.21.1 sys.version_info(major=3, minor=8, micro=5, releaselevel='final', serial=0)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
sampan501 commented, Nov 11, 2021

@tupui I removed the GPL code. I think that the bug still stands as mentioned in Phipson et al., 2011

0 reactions
mdhaber commented, Mar 29, 2022

@sampan501 just thought I’d ping you about this so we don’t forget. Please mention me when you open one. Thanks!
