Add n_components_ to SparsePCA
PCA allows retrieving the number of components with the n_components_ attribute; this is, however, not possible with SparsePCA (both PCA and SparsePCA accept the n_components argument).
Would it make sense to enable accessing n_components_ on SparsePCA too? Please note that this would be different from n_components, which is already available but represents the unprocessed input argument (i.e. None if nothing was passed).
The current PCA behaviour:
from sklearn.decomposition import PCA, SparsePCA
from sklearn import datasets
iris = datasets.load_iris()
pca = PCA()
pca.fit(iris.data)
assert pca.n_components_ == 4
assert pca.n_components is None
assert len(pca.components_) == 4
pca_3 = PCA(n_components=3)
pca_3.fit(iris.data)
assert pca_3.n_components_ == 3
assert pca_3.n_components == 3
assert len(pca_3.components_) == 3
Existing SparsePCA behaviour:
spca = SparsePCA()
spca.fit(iris.data)
assert spca.n_components is None
assert len(spca.components_) == 4
spca_3 = SparsePCA(n_components=3)
spca_3.fit(iris.data)
assert spca_3.n_components == 3
assert len(spca_3.components_) == 3
Proposed SparsePCA behaviour:
assert spca.n_components_ == 4
assert spca_3.n_components_ == 3
This could also be added to KernelPCA and other PCA methods. Implementation-wise, the code that calculates the number of components in PCA could be generalised (that is, replacing None with the actual number and/or trimming by the number of features or samples). I think it might be placed in _BasePCA, but neither SparsePCA nor KernelPCA actually descends from it. Is this the right direction?
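To make the generalisation idea concrete, here is a minimal sketch of a standalone helper. The name resolve_n_components and the cap_by_samples flag are hypothetical and not part of scikit-learn. The defaults follow my understanding of the documented behaviour: PCA keeps min(n_samples, n_features) components when n_components is None, while SparsePCA uses n_features.

```python
def resolve_n_components(n_components, n_samples, n_features, cap_by_samples=True):
    """Hypothetical helper: turn the raw n_components argument into a count.

    Only None and explicit integers are handled here; PCA's float and
    'mle' options are deliberately left out of this sketch.
    """
    if n_components is None:
        # Assumed defaults: dense PCA is limited by both dimensions,
        # SparsePCA only by the number of features.
        return min(n_samples, n_features) if cap_by_samples else n_features
    # Explicit values are returned unchanged; validating them is left
    # to the estimator itself.
    return n_components


# iris has 150 samples and 4 features
print(resolve_n_components(None, 150, 4))
print(resolve_n_components(3, 150, 4))
```

A shared helper like this could be called from each estimator's fit, which would sidestep the fact that SparsePCA and KernelPCA do not inherit from _BasePCA.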
On a related note, would it make sense to have a computed property named n_non_trivial_components_ that gives the number of components which have non-zero loadings?
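As a rough illustration of what such a property might compute, the sketch below counts the rows of a components matrix that contain at least one non-zero loading. The function name count_non_trivial_components is hypothetical; it accepts any 2-D row-iterable, so a fitted estimator's components_ array or a plain list of lists should both work.

```python
def count_non_trivial_components(components):
    """Count rows (components) that have at least one non-zero loading."""
    return sum(
        1 for row in components
        if any(value != 0 for value in row)
    )


# Toy loadings: the first and last components are entirely zero.
loadings = [
    [0.0, 0.0, 0.0],
    [0.5, 0.0, -0.2],
    [0.0, 0.0, 0.0],
]
print(count_non_trivial_components(loadings))
```

An exact comparison with zero is used here on the assumption that sparse solvers produce exact zeros; a tolerance could be substituted if that does not hold.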
Edit: a simple workaround is to use len(spca.components_), which works equally well for sparse and dense PCA. I am not sure if the addition of n_components_ is needed, but the point is that it would be great to have a consistent interface for all PCA methods!
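The workaround can be wrapped in a small duck-typed helper so that calling code does not need to know which estimator it was given. This is only a sketch: fitted_n_components is a hypothetical name, and the SimpleNamespace objects below are stand-ins for fitted estimators, not real scikit-learn objects.

```python
from types import SimpleNamespace


def fitted_n_components(estimator):
    """Return the fitted number of components for any PCA-like estimator.

    Prefers the n_components_ attribute when the estimator exposes it and
    falls back to len(estimator.components_) otherwise.
    """
    n = getattr(estimator, "n_components_", None)
    if n is not None:
        return n
    return len(estimator.components_)


# Stand-ins for fitted estimators (illustrative only).
dense_like = SimpleNamespace(n_components_=4, components_=[[0.0] * 4] * 4)
sparse_like = SimpleNamespace(components_=[[0.0] * 4] * 3)

print(fitted_n_components(dense_like))
print(fitted_n_components(sparse_like))
```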
Issue Analytics
- Created 3 years ago
- Comments:5 (4 by maintainers)

Thanks for the detailed proposition. Adding an attribute n_components_ seems reasonable. Implementation with len(spca.components_) seems straightforward. An ideal pull-request would have a small test, an entry in doc/whats_new/v0.23.rst, and an update of the docstring.
Do you want to open a pull-request?
fixed in #16981