locally failing kmeans convergence test (WSL)
pytest -sv sklearn/cluster/tests/test_k_means.py -k test_kmeans_convergence
fails for me for both algorithms:
> assert km.n_iter_ < 300
E AssertionError: assert 300 < 300
E + where 300 = KMeans(algorithm='full', n_clusters=5, n_init=1, random_state=0, tol=0).n_iter_
show_versions:
System:
python: 3.8.3 (default, May 19 2020, 18:47:26) [GCC 7.3.0]
executable: /home/andy/anaconda3/envs/sklearndev/bin/python
machine: Linux-4.4.0-18362-Microsoft-x86_64-with-glibc2.10
Python dependencies:
pip: 20.0.2
setuptools: 46.4.0.post20200518
sklearn: 0.24.dev0
numpy: 1.18.1
scipy: 1.4.1
Cython: 0.29.17
pandas: 1.0.3
matplotlib: 3.1.3
joblib: 0.15.1
threadpoolctl: 2.1.0
Built with OpenMP: True
Issue Analytics
- State:
- Created 3 years ago
- Comments:18 (18 by maintainers)
Top GitHub Comments
I don’t have a preference. I’m fine with both. I’d also be fine with silently switching to tol=eps when the user provides tol=0 😃
This test was originally added in a PR whose goal was to make sure that, when tol=0, the iteration loop doesn’t run all the way to max_iter (which it did before because of a strict inequality check).
Since we can’t guarantee that because of floating point errors, I think we should just remove the test.
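To make the strict-inequality point concrete, here is a minimal sketch of a 1-D Lloyd iteration in plain Python. It is not scikit-learn's implementation: the function `lloyd_1d`, its `strict` flag, and the toy data are all hypothetical and exist only for illustration. It shows that with tol=0 a strict `shift < tol` check never fires even once the centers stop moving, while `shift <= tol` or a tiny positive tol does. It does not reproduce the WSL failure itself, where the shift presumably never reaches exactly zero because of floating point errors.

```python
import sys


def lloyd_1d(points, centers, tol=0.0, max_iter=300, strict=True):
    """Toy 1-D Lloyd iteration (illustration only); returns (centers, n_iter)."""
    centers = list(centers)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: mean of each cluster (keep the old center if it is empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        # Squared movement of the centers between two iterations.
        shift = sum((a - b) ** 2 for a, b in zip(new_centers, centers))
        centers = new_centers
        # Hypothetical switch between the strict and non-strict check.
        converged = shift < tol if strict else shift <= tol
        if converged:
            break
    return centers, n_iter


points = [0.0, 1.0, 10.0, 11.0]
init = [0.0, 10.0]

# Strict check with tol=0: `0 < 0` is never true, so the loop runs to max_iter.
_, n_strict = lloyd_1d(points, init, tol=0.0, strict=True)

# Non-strict check with tol=0: stops as soon as the centers stop moving.
_, n_non_strict = lloyd_1d(points, init, tol=0.0, strict=False)

# Strict check with tol=eps, along the lines of the "switch to tol=eps" idea.
_, n_eps = lloyd_1d(points, init, tol=sys.float_info.epsilon, strict=True)

print(n_strict, n_non_strict, n_eps)
```

On this toy data the centers settle exactly, so the non-strict check is enough. On real data the shift can stay slightly above zero, which is why even the non-strict check gives no guarantee that n_iter_ stays below max_iter.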