Simplify spectral clustering solver logic
We need to simplify the logic for selecting a solver in spectral_clustering.
See discussion here:
https://github.com/scikit-learn/scikit-learn/issues/10715#issuecomment-369236982
https://github.com/scikit-learn/scikit-learn/pull/14647#issuecomment-521294241
https://github.com/scikit-learn/scikit-learn/pull/10720#issuecomment-518878136
Issue Analytics
- State:
- Created 4 years ago
- Comments:5 (5 by maintainers)
Once the refactoring is done and the code is clearer, there are a couple of candidates for improving the algorithm. I found them by reading the issues referenced here. For now I have two ideas, based on @lobpcg's suggestions:
1. Always call scipy.linalg.eigh() if n_nodes < 5 * n_components, that is, if the number of samples is smaller than 5 times the number of requested eigenvectors. Original comment: https://github.com/scikit-learn/scikit-learn/pull/14647#issuecomment-521076953
2. When calling ARPACK in shift-invert mode, use a sigma value of -1e-5 instead of 1.0, and do not multiply the Laplacian by -1 just before calling ARPACK. Original comment: https://github.com/scikit-learn/scikit-learn/pull/14647#issuecomment-521304431
@amueller what do you think?
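To make the two ideas above concrete, here is a minimal sketch of how they could fit together. It is not the scikit-learn implementation: the function name `smallest_laplacian_eigenpairs` is hypothetical, and the sketch assumes a symmetric, positive semi-definite graph Laplacian passed as a dense array or SciPy sparse matrix.

```python
import numpy as np
from scipy import sparse
from scipy.linalg import eigh
from scipy.sparse.linalg import eigsh


def smallest_laplacian_eigenpairs(laplacian, n_components):
    """Hypothetical helper, for illustration only (not scikit-learn code).

    Returns the n_components smallest eigenvalues of a symmetric PSD
    Laplacian and their eigenvectors, sorted in ascending order.
    """
    n_nodes = laplacian.shape[0]

    # Idea 1: for small problems, always use the dense solver.
    if n_nodes < 5 * n_components:
        dense = laplacian.toarray() if sparse.issparse(laplacian) else np.asarray(laplacian)
        eigenvalues, eigenvectors = eigh(dense)  # eigenvalues come back ascending
        return eigenvalues[:n_components], eigenvectors[:, :n_components]

    # Idea 2: ARPACK in shift-invert mode with a small negative sigma.
    # The Laplacian is used as-is (no multiplication by -1). Because it is
    # PSD, (L - sigma * I) is positive definite for sigma < 0, and the
    # eigenvalues closest to sigma are the smallest ones.
    eigenvalues, eigenvectors = eigsh(laplacian, k=n_components, sigma=-1e-5, which="LM")
    order = np.argsort(eigenvalues)
    return eigenvalues[order], eigenvectors[:, order]
```

With this layout the solver choice is a single size check, and the shift-invert branch no longer needs the sign flip, since the shift alone keeps the factored matrix non-singular.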
After reading the issues and PRs referenced here and studying the spectral clustering implementation, I understand that the goal of this enhancement is to refactor the logic for choosing an eigen solver in the function spectral_embedding(), located in manifold/spectral_embedding.py. I opened a [WIP] PR #15136, where I show an initial suggestion for refactoring spectral_embedding(). My intention is just to present an idea for simplifying the logic that chooses the solver. It felt easier to write code than to write a design document. I hope to get feedback that will help me focus going forward. I also intend to add more regression tests, but I would rather work on the tests once the refactoring goal is clearer to me.