NMI and AMI use inconsistent definitions of mutual information

There exist many definitions of NMI and AMI.

Vinh, N. X., Epps, J., & Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11(Oct), 2837-2854.

mention five different definitions of NMI and, based on those, give four different definitions of AMI.

The NMI implemented in sklearn uses sqrt(H(U) * H(V)) for normalization. The AMI implemented in sklearn uses max(H(U), H(V)) for normalization.
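
For reference, the two normalizations named above can be written out as follows. This is a sketch of how these variants are usually stated, not a quotation from the sklearn source. Here I(U;V) is the mutual information, H is the Shannon entropy, and E[I(U;V)] is the expected mutual information under the chance model used for the adjustment.

```latex
\mathrm{NMI}_{\mathrm{sqrt}}(U,V) = \frac{I(U;V)}{\sqrt{H(U)\,H(V)}}
\qquad
\mathrm{AMI}_{\max}(U,V) = \frac{I(U;V) - \mathbb{E}[I(U;V)]}{\max\{H(U),\,H(V)\} - \mathbb{E}[I(U;V)]}
```

A consistent pair would use the same normalizer in both expressions, for example max in both or sqrt in both.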

There exists an NMI with the max normalization, and an AMI with the sqrt normalization, so this is inconsistent in sklearn. Ideally, they would both use the same definition by default, and allow using any of the others via an option.
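
A minimal sketch of what "one definition, selectable via an option" could look like for NMI, written in plain Python rather than against sklearn. The function names, the `method` parameter, and the option strings are all hypothetical, and the handling of the zero-entropy case is an assumption. AMI is left out because it also needs the expected mutual information, but it would reuse the same normalizer table.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Mutual information (in nats) between two label assignments."""
    n = len(u)
    count_u, count_v = Counter(u), Counter(v)
    joint = Counter(zip(u, v))
    return sum(
        (c / n) * math.log(c * n / (count_u[a] * count_v[b]))
        for (a, b), c in joint.items()
    )


# Hypothetical option names; each maps (H(U), H(V)) to a normalizer.
NORMALIZERS = {
    "min": min,
    "sqrt": lambda hu, hv: math.sqrt(hu * hv),
    "sum": lambda hu, hv: (hu + hv) / 2,
    "max": max,
}


def nmi(u, v, method="sqrt"):
    """NMI with a selectable normalization (illustrative only)."""
    norm = NORMALIZERS[method](entropy(u), entropy(v))
    if norm == 0:
        # Assumption: treat the degenerate single-cluster case as 0.0.
        return 0.0
    return mutual_information(u, v) / norm


u = [0, 0, 0, 1, 1, 1]
v = [0, 0, 1, 1, 2, 2]
for method in NORMALIZERS:
    print(method, round(nmi(u, v, method), 4))
```

Because min <= sqrt <= sum <= max as normalizers, the resulting scores are ordered the other way round, so the choice of definition changes the reported number for the same pair of clusterings.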

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

1 reaction
aryamccarthy commented, May 24, 2018

Ooh, a twist. Sum is actually what V-measure uses—not sqrt. It seems we’ve covered the entire gamut. I’m going to take that as another argument in favor of sum. << Thought I hit ‘Comment’ on this some time ago.

1 reaction
aryamccarthy commented, May 24, 2018

I’ve created a PR; waiting for tests to pass. I think converging on sqrt is best for uniformity with V-measure. EDIT: Nope, it’s not sqrt. It’s sum.
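
The claim that V-measure corresponds to the sum normalization can be checked numerically. The sketch below assumes the usual definition of V-measure as the harmonic mean of homogeneity and completeness (with equal weighting) and compares it with 2 * I(U;V) / (H(U) + H(V)). All function names are illustrative, and the code assumes both labelings have more than one cluster so that no entropy is zero.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(x, given):
    """Conditional entropy H(X | Y) in nats, with Y = given."""
    n = len(x)
    joint = Counter(zip(x, given))
    count_y = Counter(given)
    return -sum((c / n) * math.log(c / count_y[b]) for (_, b), c in joint.items())


def v_measure(classes, clusters):
    """Harmonic mean of homogeneity and completeness (equal weights)."""
    homogeneity = 1 - conditional_entropy(classes, clusters) / entropy(classes)
    completeness = 1 - conditional_entropy(clusters, classes) / entropy(clusters)
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def nmi_sum(u, v):
    """NMI normalized by the arithmetic mean of the two entropies."""
    mi = entropy(u) - conditional_entropy(u, v)  # I(U;V) = H(U) - H(U|V)
    return 2 * mi / (entropy(u) + entropy(v))


classes = [0, 0, 0, 1, 1, 1]
clusters = [0, 0, 1, 1, 2, 2]
print(v_measure(classes, clusters), nmi_sum(classes, clusters))
```

The two values agree up to floating-point error because homogeneity is I/H(classes) and completeness is I/H(clusters), and their harmonic mean simplifies to 2I / (H(classes) + H(clusters)).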
