Incorrect calculations of homogeneity, completeness and v-measure

Description

The calculations of homogeneity, completeness and v-measure are currently based on the original paper by Rosenberg & Hirschberg (2007). However, while researching fuzzy clustering evaluation techniques, I found a paper by Utt et al. (2014) (http://www.lrec-conf.org/proceedings/lrec2014/pdf/829_Paper.pdf) which explains in a footnote that the original definitions of homogeneity and completeness contain typos. The authors state that Rosenberg himself confirmed this in personal communication.

Definitions used:

  • homogeneity = 1 - H(C|K) / H(C)
  • completeness = 1 - H(K|C) / H(K)

Corrected definitions:

  • homogeneity = 1 - H(C|K) / H(C,K)
  • completeness = 1 - H(K|C) / H(K,C)
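
The two sets of definitions can be compared side by side with a small standard-library sketch. This is not the sklearn implementation: the helper names and the `joint` flag are assumptions made for illustration, and entropies use the natural logarithm.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def joint_entropy(a, b):
    """Entropy H(A, B) of the paired labels."""
    return entropy(list(zip(a, b)))


def homogeneity_completeness(labels_true, labels_pred, joint=False):
    """Return (homogeneity, completeness).

    joint=False normalizes by H(C) and H(K) (the definitions currently used);
    joint=True normalizes both by H(C, K) (the proposed correction).
    The `joint` flag is hypothetical and exists only for this comparison.
    """
    h_c = entropy(labels_true)
    h_k = entropy(labels_pred)
    h_ck = joint_entropy(labels_true, labels_pred)
    h_c_given_k = h_ck - h_k  # chain rule: H(C|K) = H(C,K) - H(K)
    h_k_given_c = h_ck - h_c  # chain rule: H(K|C) = H(C,K) - H(C)
    norm_h, norm_c = (h_ck, h_ck) if joint else (h_c, h_k)
    homogeneity = 1.0 - h_c_given_k / norm_h if norm_h else 1.0
    completeness = 1.0 - h_k_given_c / norm_c if norm_c else 1.0
    return homogeneity, completeness


# Classes and clusters that are statistically independent of each other.
labels_true = [0, 0, 1, 1]
labels_pred = [0, 1, 0, 1]
print(homogeneity_completeness(labels_true, labels_pred))              # approximately (0.0, 0.0)
print(homogeneity_completeness(labels_true, labels_pred, joint=True))  # approximately (0.5, 0.5)
```

For this toy input the two variants disagree strongly: normalizing by the joint entropy no longer gives 0 for a clustering that carries no information about the classes.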

Furthermore, the current implementation computes these scores from the mutual information score, which would no longer be correct under the corrected definitions. The statement in the documentation that v-measure is the same as normalized mutual information with the averaging method set to ‘arithmetic’ would also be false.
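
A short derivation may help show why the link to normalized mutual information depends on the denominator. It is a sketch worked out from the definitions listed in this issue, not taken from either paper. It uses I(C;K) = H(C) - H(C|K) = H(K) - H(K|C) and H(C,K) = H(K) + H(C|K) = H(C) + H(K|C); primes mark the joint-entropy variants.

```latex
\begin{align*}
h  &= 1 - \frac{H(C \mid K)}{H(C)} = \frac{I(C;K)}{H(C)}, &
c  &= 1 - \frac{H(K \mid C)}{H(K)} = \frac{I(C;K)}{H(K)}, &
V  &= \frac{2hc}{h + c} = \frac{2\, I(C;K)}{H(C) + H(K)}, \\
h' &= 1 - \frac{H(C \mid K)}{H(C,K)} = \frac{H(K)}{H(C,K)}, &
c' &= 1 - \frac{H(K \mid C)}{H(K,C)} = \frac{H(C)}{H(C,K)}, &
V' &= \frac{2h'c'}{h' + c'} = \frac{2\, H(C)\, H(K)}{H(C,K)\,\bigl(H(C) + H(K)\bigr)}.
\end{align*}
```

With the single-entropy denominators, V is the mutual information divided by the arithmetic mean of H(C) and H(K). With the joint-entropy denominators, V' no longer has that form.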

Steps/Code to Reproduce

from sklearn.metrics import homogeneity_completeness_v_measure

Expected Results

Actual Results

Versions

System:
    python: 3.6.7 (v3.6.7:6ec5cf24b7, Oct 20 2018, 13:35:33) [MSC v.1900 64 bit (AMD64)]
    executable: C:\Users\dtuser\AppData\Local\Programs\Python\Python36\python.exe
    machine: Windows-7-6.1.7601-SP1

BLAS:
    macros:
    lib_dirs:
    cblas_libs: cblas

Python deps:
    pip: 18.1
    setuptools: 40.6.3
    sklearn: 0.20.1
    numpy: 1.15.4
    scipy: 1.1.0
    Cython: None
    pandas: 0.23.4

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

royklip commented, Jan 29, 2019

Well, I ran some tests yesterday, and it seems that the joint entropy does not differ much from the ‘single’ entropy when the conditional entropy is relatively small. In that case the scores only differed after the second decimal place. However, this was tested on a set where the score was already high (> 0.98).

I just reproduced the examples from the paper, which resulted in much larger differences.
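
The effect described here can be illustrated with a rough standard-library sketch. The two contingency tables are made-up examples, not the data from the tests above or from the paper, and `homogeneity_gap` is a hypothetical helper.

```python
from math import log


def entropies(contingency):
    """Return H(C), H(K), H(C, K) for a class-by-cluster count table."""
    n = sum(sum(row) for row in contingency)

    def h(counts):
        # Natural-log entropy of the distribution given by the counts.
        return -sum((x / n) * log(x / n) for x in counts if x)

    h_c = h([sum(row) for row in contingency])        # row sums: classes
    h_k = h([sum(col) for col in zip(*contingency)])  # column sums: clusters
    h_ck = h([x for row in contingency for x in row])  # all cells: joint
    return h_c, h_k, h_ck


def homogeneity_gap(contingency):
    """Joint-normalized homogeneity minus H(C)-normalized homogeneity.

    Assumes more than one class is present, so that H(C) > 0.
    """
    h_c, h_k, h_ck = entropies(contingency)
    h_c_given_k = h_ck - h_k
    return (1 - h_c_given_k / h_ck) - (1 - h_c_given_k / h_c)


# Hypothetical tables: rows are true classes, columns are predicted clusters.
near_perfect = [[99, 1], [1, 99]]
poor = [[60, 40], [40, 60]]
print(homogeneity_gap(near_perfect))  # small gap
print(homogeneity_gap(poor))          # much larger gap
```

When H(C|K) is small, H(C,K) is close to H(K), so the choice of denominator barely moves the score; as H(C|K) grows, the two definitions drift apart.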

royklip commented, Jan 29, 2019

So far, it looks like they used the single entropy for their examples; at least, the calculations with it give back the same scores. If I use the joint entropy, the results differ by quite a bit.

I used the following code, which is nearly the same as the one in sklearn except for the joint entropy addition:

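# Import path assumes sklearn 0.20.x (the version reported in this issue);
# newer releases renamed the module to sklearn.metrics.cluster._supervised.
from sklearn.metrics.cluster.supervised import (
    check_clusterings, contingency_matrix, entropy, mutual_info_score)


def homogeneity_completeness_v_measure(labels_true, labels_pred):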
    labels_true, labels_pred = check_clusterings(labels_true, labels_pred)

    if len(labels_true) == 0:
        return 1.0, 1.0, 1.0

    entropy_true = entropy(labels_true)
    entropy_pred = entropy(labels_pred)

    contingency = contingency_matrix(labels_true, labels_pred, sparse=True)
    mi = mutual_info_score(None, None, contingency=contingency)
    cond_entropy_tp = entropy_true - mi
    cond_entropy_pt = entropy_pred - mi

    # Same as: joint_entropy = entropy_true + cond_entropy_pt
    joint_entropy = entropy_pred + cond_entropy_tp

    homogeneity = 1 - cond_entropy_tp / joint_entropy if joint_entropy else 1.0
    completeness = 1 - cond_entropy_pt / joint_entropy if joint_entropy else 1.0

    if homogeneity + completeness == 0.0:
        v_measure_score = 0.0
    else:
        v_measure_score = (2.0 * homogeneity * completeness /
                           (homogeneity + completeness))

    return homogeneity, completeness, v_measure_score