sklearn.decomposition.DictionaryLearning Example code
Describe the issue linked to the documentation
Hello, I am trying to use the example code for sklearn.decomposition.DictionaryLearning. The code is at the following link: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.DictionaryLearning.html#sklearn.decomposition.DictionaryLearning
In detail, the first step uses the function make_sparse_coded_signal to produce a signal, X:
X, dictionary, code = make_sparse_coded_signal(
    n_samples=100, n_components=15, n_features=20, n_nonzero_coefs=10,
    random_state=42,
)
print(X.shape)
(20, 100)
After this, a dictionary learner is defined as:
dict_learner = DictionaryLearning(
    n_components=15, transform_algorithm='lasso_lars', random_state=42,
)
And it is applied to the previously defined data as:
X_transformed = dict_learner.fit_transform(X)
The resulting X_transformed has shape (20, 15). According to the documentation, dict_learner.fit_transform should take an input matrix of shape (n_samples, n_features) and return a matrix of shape (n_samples, n_features_new). However, the input matrix here has shape (n_features, n_samples), and the resulting matrix has shape (n_features, n_components).
To avoid this behaviour I added X = X.transpose() before calling dict_learner.fit_transform(X). The input then has the form (n_samples, n_features) and the output (n_samples, n_components).
I do not know if I am missing something or misunderstanding something. Thank you very much for your support and for your time.
Steps/Code to Reproduce
import numpy as np
from sklearn.datasets import make_sparse_coded_signal
from sklearn.decomposition import DictionaryLearning
X, dictionary, code = make_sparse_coded_signal(
n_samples=100, n_components=15, n_features=20, n_nonzero_coefs=10,
random_state=42,
)
dict_learner = DictionaryLearning(
n_components=15, transform_algorithm='lasso_lars', random_state=42,
)
X_transformed = dict_learner.fit_transform(X)
Expected Results
Input: (100, 20) -> (examples, features)
Output: (100, 15) -> (examples, components)
Actual Results
Input: (20, 100) -> (features, examples)
Output: (20, 15) -> (features, components)
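One way to confirm how the estimator interpreted the input is to inspect the shape of the learned dictionary, components_, which is (n_components, n_features). The sketch below is only an illustration: it uses random stand-in data deliberately laid out as (20, 100) instead of the output of make_sparse_coded_signal, and max_iter=20 is an arbitrary choice to keep it quick.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in data (an assumption, not the original signal), laid out as
# (n_features, n_samples) = (20, 100) to mimic the reported orientation.
rng = np.random.RandomState(42)
X_wrong = rng.randn(20, 100)

dict_learner = DictionaryLearning(
    n_components=15, transform_algorithm='lasso_lars', random_state=42,
    max_iter=20,  # arbitrary small value to keep the sketch fast
)
X_transformed = dict_learner.fit_transform(X_wrong)

# The estimator always reads axis 0 as samples and axis 1 as features.
print(X_transformed.shape)             # (20, 15): 20 "samples", 15 components
print(dict_learner.components_.shape)  # (15, 100): atoms live in a 100-dim space
```

Here the atoms have length 100, which shows that the 100 examples were treated as features.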
Suggest a potential alternative/fix
import numpy as np
from sklearn.datasets import make_sparse_coded_signal
from sklearn.decomposition import DictionaryLearning
X, dictionary, code = make_sparse_coded_signal(
n_samples=100, n_components=15, n_features=20, n_nonzero_coefs=10,
random_state=42,
)
X = X.transpose()  # POTENTIAL FIX
dict_learner = DictionaryLearning(
n_components=15, transform_algorithm='lasso_lars', random_state=42,
)
X_transformed = dict_learner.fit_transform(X)
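A slightly more defensive variant of this fix transposes only when the array is not already laid out as (n_samples, n_features), in case the orientation returned by make_sparse_coded_signal differs between scikit-learn releases (an assumption on my side, not something verified here). The sketch also reconstructs the signal from the codes as a sanity check; the names n_samples, n_features, n_components, X_hat and rel_err are introduced only for illustration.

```python
import numpy as np
from sklearn.datasets import make_sparse_coded_signal
from sklearn.decomposition import DictionaryLearning

n_samples, n_features, n_components = 100, 20, 15

X, dictionary, code = make_sparse_coded_signal(
    n_samples=n_samples, n_components=n_components, n_features=n_features,
    n_nonzero_coefs=10, random_state=42,
)

# Transpose only if X is not already (n_samples, n_features).
# This check is unambiguous here because n_samples != n_features.
if X.shape != (n_samples, n_features):
    X = X.T

dict_learner = DictionaryLearning(
    n_components=n_components, transform_algorithm='lasso_lars',
    random_state=42,
)
X_transformed = dict_learner.fit_transform(X)

# Reconstruct the signal: (n_samples, n_components) @ (n_components, n_features)
X_hat = X_transformed @ dict_learner.components_

# Mean relative squared reconstruction error per sample
rel_err = np.mean(np.sum((X_hat - X) ** 2, axis=1) / np.sum(X ** 2, axis=1))

print(X.shape, X_transformed.shape, X_hat.shape)
print(rel_err)
```

The matrix product only has the same shape as X when the input is oriented as (n_samples, n_features), so the reconstruction doubles as a shape check.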
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (6 by maintainers)
Top GitHub Comments
The example is wrong right now and the PR for the transpose is not merged yet (I think I’ll finalize it btw). I think it’s better to fix the example right away
Yes, I would like to. Thank you for your quick response. Am I changing the file https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/decomposition/_dict_learning.py?