[ENH] ICA based on AJD (SOBI, AMUSE, NSS)
While developing an extension of ICA to grouped data (as opposed to simply pooling the data), we (@NiklasPfister & myself) also implemented a variety of related linear ICA algorithms. They are based on approximate joint matrix diagonalization and, in contrast to FastICA, do not rely on non-Gaussianity of the sources. In particular, for the right choice of parameters, our UwedgeICA sklearn transformer implements the following ICA algorithms:
- NSS-JS, NSS-TD-JD (NSS = non-stationarity in time or space) as well as
- the popular SOBI (second-order) and its predecessor AMUSE
(references can be found in the paper below). Additionally, the CoroICA transformer implements confounding-robust extensions of the aforementioned methods.
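To give a feel for the joint-diagonalization idea behind these methods, here is a minimal numpy sketch of the simplest variant, AMUSE (this is an illustrative toy implementation, not the coroica code): whiten the data, then eigendecompose a single symmetrized lagged autocovariance of the whitened data, which recovers the remaining rotation whenever the sources have distinct autocorrelations at that lag.

```python
import numpy as np

def amuse(X, tau=1):
    """Toy AMUSE sketch: whiten, then eigendecompose one symmetrized
    lag-tau autocovariance of the whitened data."""
    Xc = X - X.mean(axis=0)
    n = len(Xc)
    # whitening via eigendecomposition of the sample covariance
    d, E = np.linalg.eigh(Xc.T @ Xc / n)
    W_white = E @ np.diag(d ** -0.5) @ E.T
    Z = Xc @ W_white
    # symmetrized lag-tau autocovariance; its eigenvectors give the rotation
    C_tau = Z[:-tau].T @ Z[tau:] / (n - tau)
    _, U = np.linalg.eigh((C_tau + C_tau.T) / 2)
    W = U.T @ W_white  # unmixing matrix
    return Xc @ W.T, W

# two sinusoidal sources with distinct autocorrelations, linearly mixed
t = np.arange(4000)
S = np.column_stack([np.sin(2 * np.pi * t / 50), np.sin(2 * np.pi * t / 23)])
A = np.array([[1.0, 0.6], [0.4, 1.0]])
S_hat, W = amuse(S @ A.T, tau=2)
```

SOBI generalizes this by jointly (approximately) diagonalizing many lagged autocovariances at once, which is where the uwedge AJD algorithm comes in.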
- Code
- JMLR Paper
- Website
- Documentation
- We are attaching an MWE below.
We guess it should be quite doable to include these in sklearn: our current implementation is already sklearn-flavoured (built on the respective mixins), PEP8 compliant, and comes with examples and documentation. The existing sklearn examples for FastICA could also easily be adapted and extended to demo the different ICA variants.
We are raising the issue to get early feedback and gauge the interest — what do you think?
Get the current implementation: `pip install coroica`
```python
from coroica import UwedgeICA
from coroica.utils import md_index
import numpy as np
from sklearn.decomposition import FastICA

np.random.seed(0)
# generate 2 sources with changing variance
S = np.random.randn(1000, 2) \
    * np.c_[np.sin(np.arange(1000) / 1000),
            np.cos(np.arange(1000) / 1000)]
A = np.random.randn(2, 2)
X = (A.dot(S.T)).T
# transformers
fastica = FastICA(random_state=0)
uwedgeica = UwedgeICA(partitionsize=50)
# fit both ICAs
fastica.fit(X)
uwedgeica.fit(X)
print(f'MD(A,V_fastica) = {md_index(A, fastica.components_):.2f}')
print(f'MD(A,V_uwedge) = {md_index(A, uwedgeica.V_):.2f}')
# yields:
# MD(A,V_fastica) = 0.04
# MD(A,V_uwedge) = 0.02
```
- Created 4 years ago
- Comments: 12 (9 by maintainers)
Thanks for your input. We agree that SOBI is the most widely used and hence the most relevant addition to scikit-learn. In our current implementation, all four of the mentioned ICAs are realized by the exact same code base, just with different parameter settings (this works because the methods only differ in which set of (auto-)covariance matrices they jointly diagonalize). We would therefore propose to do the following:
If that sounds good to you we would start with a WIP-PR.
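To make the "same code base, different parameters" point concrete, here is a rough numpy sketch of how one might collect the family of matrices each variant hands to the joint diagonalizer (the helper name and its parameters are hypothetical, chosen for illustration; they are not the coroica API):

```python
import numpy as np

def ajd_targets(X, timelags=(), partitionsize=None):
    """Hypothetical helper: collect the matrices a given ICA variant
    jointly diagonalizes."""
    Xc = X - X.mean(axis=0)
    n = len(Xc)
    mats = [Xc.T @ Xc / n]  # instantaneous covariance
    for tau in timelags:    # SOBI/AMUSE: lagged autocovariances
        C = Xc[:-tau].T @ Xc[tau:] / (n - tau)
        mats.append((C + C.T) / 2)  # symmetrized
    if partitionsize:       # NSS-style: covariances of time partitions
        for s in range(0, n - partitionsize + 1, partitionsize):
            Xp = Xc[s:s + partitionsize]
            mats.append(Xp.T @ Xp / partitionsize)
    return mats
```

A single AJD routine applied to the returned list then covers all variants; only the construction of the target matrices changes.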
I’m excited. Thank you!