
locally failing kmeans convergence test (WSL)

> pytest -sv sklearn/cluster/tests/test_k_means.py -k test_kmeans_convergence

fails for me for both algorithms:

>       assert km.n_iter_ < 300
E       AssertionError: assert 300 < 300
E        +  where 300 = KMeans(algorithm='full', n_clusters=5, n_init=1, random_state=0, tol=0).n_iter_

show_versions:

System:
    python: 3.8.3 (default, May 19 2020, 18:47:26)  [GCC 7.3.0]
executable: /home/andy/anaconda3/envs/sklearndev/bin/python
   machine: Linux-4.4.0-18362-Microsoft-x86_64-with-glibc2.10

Python dependencies:
          pip: 20.0.2
   setuptools: 46.4.0.post20200518
      sklearn: 0.24.dev0
        numpy: 1.18.1
        scipy: 1.4.1
       Cython: 0.29.17
       pandas: 1.0.3
   matplotlib: 3.1.3
       joblib: 0.15.1
threadpoolctl: 2.1.0

Built with OpenMP: True

Issue Analytics

Created 3 years ago
Comments: 18 (18 by maintainers)

Top GitHub Comments

jeremiedbb commented, Jun 2, 2020

I don’t have a preference. I’m fine with both. I’d also be fine with silently switching to tol=eps when the user provides tol=0 😃
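For illustration, the "silently switch to tol=eps" idea could look roughly like the sketch below. The helper name effective_tol is hypothetical, and using float64 machine epsilon is an assumption: the comment does not say which eps is meant or whether it would be scaled by the data.

```python
import sys


def effective_tol(tol):
    """Hypothetical helper: swap an exact-zero tolerance for machine epsilon.

    Assumes float64 epsilon; the original comment does not specify which
    eps scikit-learn would use.
    """
    if tol == 0:
        return sys.float_info.epsilon
    return tol


# A center shift that is pure floating-point noise never satisfies
# "shift <= 0", but it does fall under the substituted tolerance.
noise_shift = 1e-20
print(noise_shift <= 0)                 # comparison against the user's tol=0
print(noise_shift <= effective_tol(0))  # comparison against tol=eps
```

With this substitution a user who asks for tol=0 still gets an early stop once the centers move by less than rounding noise, at the cost of no longer honoring an exact zero.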

jeremiedbb commented, Jun 2, 2020

Initially this test was added in a PR whose goal was to make sure that, when tol=0, the iteration loop doesn’t run until max_iter (which it did before because of a strict inequality check).

Since we can’t guarantee that due to floating point errors, I think we should just remove the test.
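To make the strict-inequality point concrete, here is a toy 1-D Lloyd iteration in plain Python. It is only a sketch and not scikit-learn's implementation: the function lloyd_1d, its strict flag, and the sample points are all made up for illustration. The data is chosen so that the arithmetic is exact and the center shift reaches exactly zero.

```python
def lloyd_1d(points, centers, tol, max_iter, strict):
    """Toy 1-D Lloyd iteration; returns the number of iterations run."""
    centers = list(centers)
    for n_iter in range(1, max_iter + 1):
        # Assignment step: group points by their nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        # Squared shift of the centers between two iterations.
        shift = sum((a - b) ** 2 for a, b in zip(centers, new_centers))
        centers = new_centers
        # With tol=0, "shift < tol" can never be true.
        converged = shift < tol if strict else shift <= tol
        if converged:
            break
    return n_iter


points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
strict_iters = lloyd_1d(points, [0.0, 1.0], tol=0, max_iter=300, strict=True)
relaxed_iters = lloyd_1d(points, [0.0, 1.0], tol=0, max_iter=300, strict=False)
print(strict_iters, relaxed_iters)  # 300 3
```

In this toy case the non-strict check stops as soon as the centers stop moving. On real floating-point data the shift may hover just above zero instead of reaching it, which is why even the non-strict check cannot guarantee n_iter_ < max_iter.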
