spectral_embedding tests fail on 64-bit Little Endian PowerPC (ppc64le)
On a Debian buildd host for 64-bit Little Endian PowerPC (ppc64le, which Debian calls ppc64el), three tests involving spectral_embedding() fail (two of them were recently added in commit https://github.com/scikit-learn/scikit-learn/commit/e52e9c8d7536b6315da655164951060642a52707):
sklearn/cluster/tests/test_spectral.py::test_precomputed_nearest_neighbors_filtering FAILED
sklearn/manifold/tests/test_spectral_embedding.py::test_precomputed_nearest_neighbors_filtering FAILED
sklearn/tests/test_common.py::test_estimators[SpectralEmbedding()-check_pipeline_consistency] FAILED
In contrast, on 64-bit Big Endian PowerPC (ppc64) all of the tests pass (build log), so is this perhaps an endianness issue?
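For context, the native byte order on a build host can be confirmed directly from Python/NumPy; a minimal sketch (not taken from the build log):

import sys
import numpy as np

# 'little' on ppc64el, 'big' on ppc64
print(sys.byteorder)

# '=' means the dtype uses the platform's native byte order
print(np.dtype(np.float64).byteorder)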
From the build log on the failing ppc64el:
=================================== FAILURES ===================================
_________________ test_precomputed_nearest_neighbors_filtering _________________
    def test_precomputed_nearest_neighbors_filtering():
        # Test precomputed graph filtering when containing too many neighbors
        X, y = make_blobs(n_samples=200, random_state=0,
                          centers=[[1, 1], [-1, -1]], cluster_std=0.01)
        n_neighbors = 2
        results = []
        for additional_neighbors in [0, 10]:
            nn = NearestNeighbors(
                n_neighbors=n_neighbors + additional_neighbors).fit(X)
            graph = nn.kneighbors_graph(X, mode='connectivity')
            labels = SpectralClustering(random_state=0, n_clusters=2,
                                        affinity='precomputed_nearest_neighbors',
                                        n_neighbors=n_neighbors).fit(graph).labels_
            results.append(labels)
>       assert_array_equal(results[0], results[1])
E AssertionError:
E Arrays are not equal
E
E Mismatch: 49.5%
E Max absolute difference: 1
E Max relative difference: 1.
E x: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
E 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
E 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
E y: array([1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1,
E 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0,
E 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0,...
sklearn/cluster/tests/test_spectral.py:122: AssertionError
_________________ test_precomputed_nearest_neighbors_filtering _________________
    def test_precomputed_nearest_neighbors_filtering():
        # Test precomputed graph filtering when containing too many neighbors
        n_neighbors = 2
        results = []
        for additional_neighbors in [0, 10]:
            nn = NearestNeighbors(
                n_neighbors=n_neighbors + additional_neighbors).fit(S)
            graph = nn.kneighbors_graph(S, mode='connectivity')
            embedding = SpectralEmbedding(random_state=0, n_components=2,
                                          affinity='precomputed_nearest_neighbors',
                                          n_neighbors=n_neighbors
                                          ).fit(graph).embedding_
            results.append(embedding)
>       assert_array_equal(results[0], results[1])
E AssertionError:
E Arrays are not equal
E
E Mismatch: 100%
E Max absolute difference: 0.23411947
E Max relative difference: 763.22760149
E x: array([[-0.030262, 0.035582],
E [ 0.032586, -0.007568],
E [-0.033157, 0.033655],...
E y: array([[ 0.040137, -0.037808],
E [-0.004603, -0.024269],
E [ 0.032301, -0.00754 ],...
sklearn/manifold/tests/test_spectral_embedding.py:159: AssertionError
_______ test_estimators[SpectralEmbedding()-check_pipeline_consistency] ________
estimator = SpectralEmbedding(affinity='nearest_neighbors', eigen_solver=None, gamma=None,
                              n_components=2, n_jobs=None, n_neighbors=None,
                              random_state=None)
check = functools.partial(<function check_pipeline_consistency at 0x7fff84bd4ee0>, 'SpectralEmbedding')

    @parametrize_with_checks(_tested_estimators())
    def test_estimators(estimator, check):
        # Common tests for estimator instances
        with ignore_warnings(category=(FutureWarning,
                                       ConvergenceWarning,
                                       UserWarning, FutureWarning)):
            _set_checking_parameters(estimator)
>           check(estimator)
sklearn/tests/test_common.py:101:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/utils/_testing.py:327: in wrapper
return fn(*args, **kwargs)
sklearn/utils/estimator_checks.py:1285: in check_pipeline_consistency
assert_allclose_dense_sparse(result, result_pipe)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
x = array([[ 9.81790160e-02, -1.68785957e-15],
[ 1.59037212e-01, -9.43588019e-02],
[ 1.59037212e-01, 2.6359....81790160e-02, 2.94383302e-16],
[ 9.81790160e-02, 3.86709855e-16],
[ 1.59037212e-01, -2.12784505e-01]])
y = array([[-1.34083137e-01, -1.51153284e-15],
[ 6.32880953e-02, -9.43588019e-02],
[ 6.32880953e-02, 2.6359....34083137e-01, -4.71346606e-16],
[-1.34083137e-01, 3.63755266e-16],
[ 6.32880953e-02, -2.12784505e-01]])
rtol = 1e-07, atol = 1e-09, err_msg = ''
    def assert_allclose_dense_sparse(x, y, rtol=1e-07, atol=1e-9, err_msg=''):
        """Assert allclose for sparse and dense data.
        Both x and y need to be either sparse or dense, they
        can't be mixed.
        Parameters
        ----------
        x : array-like or sparse matrix
            First array to compare.
        y : array-like or sparse matrix
            Second array to compare.
        rtol : float, optional
            relative tolerance; see numpy.allclose
        atol : float, optional
            absolute tolerance; see numpy.allclose. Note that the default here is
            more tolerant than the default for numpy.testing.assert_allclose, where
            atol=0.
        err_msg : string, default=''
            Error message to raise.
        """
        if sp.sparse.issparse(x) and sp.sparse.issparse(y):
            x = x.tocsr()
            y = y.tocsr()
            x.sum_duplicates()
            y.sum_duplicates()
            assert_array_equal(x.indices, y.indices, err_msg=err_msg)
            assert_array_equal(x.indptr, y.indptr, err_msg=err_msg)
            assert_allclose(x.data, y.data, rtol=rtol, atol=atol, err_msg=err_msg)
        elif not sp.sparse.issparse(x) and not sp.sparse.issparse(y):
            # both dense
>           assert_allclose(x, y, rtol=rtol, atol=atol, err_msg=err_msg)
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=1e-09
E
E Mismatch: 50%
E Max absolute difference: 0.31393279
E Max relative difference: 7.46490707
E x: array([[ 9.817902e-02, -1.687860e-15],
E [ 1.590372e-01, -9.435880e-02],
E [ 1.590372e-01, 2.635916e-01],...
E y: array([[-1.340831e-01, -1.511533e-15],
E [ 6.328810e-02, -9.435880e-02],
E [ 6.328810e-02, 2.635916e-01],...
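For anyone trying to reproduce the SpectralEmbedding failure outside the test suite, here is a self-contained sketch of the second failing comparison. The original test relies on a module-level dataset S defined elsewhere in test_spectral_embedding.py; the make_blobs data below is only a hypothetical stand-in for it:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.manifold import SpectralEmbedding
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-in for the module-level S used by the real test
S, _ = make_blobs(n_samples=200, random_state=0,
                  centers=[[1, 1], [-1, -1]], cluster_std=0.01)

n_neighbors = 2
results = []
for additional_neighbors in [0, 10]:
    nn = NearestNeighbors(n_neighbors=n_neighbors + additional_neighbors).fit(S)
    graph = nn.kneighbors_graph(S, mode='connectivity')
    embedding = SpectralEmbedding(random_state=0, n_components=2,
                                  affinity='precomputed_nearest_neighbors',
                                  n_neighbors=n_neighbors).fit(graph).embedding_
    results.append(embedding)

# Expected to be (near) zero; on the ppc64el builder the two runs differ
print(np.abs(results[0] - results[1]).max())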
From the same build log, here is the output of show_versions():
System:
python: 3.8.2rc1 (default, Feb 11 2020, 15:26:48) [GCC 9.2.1 20200203]
executable: /usr/bin/python3.8
machine: Linux-4.19.0-8-powerpc64le-ppc64le-with-glibc2.29
Python dependencies:
pip: None
setuptools: 44.0.0
sklearn: 0.22.1
numpy: 1.17.4
scipy: 1.3.3
Cython: 0.29.14
pandas: 0.25.3
matplotlib: 3.1.2
joblib: 0.14.0
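For reference, the same environment report can be regenerated on any host with scikit-learn's built-in helper:

import sklearn

# Prints the System and Python dependencies sections shown above
sklearn.show_versions()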
Top GitHub Comments
Well, the easy part is confirming that it’s an issue with our scipy build:
Thanks a lot, @ckastner, for the feedback. This can be closed for now.
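As a possible next step on the affected hardware, one could exercise SciPy's symmetric eigensolver directly on a small graph Laplacian, bypassing scikit-learn entirely, and diff the output against an x86-64 or ppc64 host. This is only a hypothetical diagnostic sketch, not something that was run for this issue:

import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import laplacian
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph

# Small k-NN affinity graph similar to the one used by the failing tests
X, _ = make_blobs(n_samples=200, random_state=0,
                  centers=[[1, 1], [-1, -1]], cluster_std=0.01)
graph = kneighbors_graph(X, n_neighbors=2, mode='connectivity')
graph = 0.5 * (graph + graph.T)          # symmetrize the connectivity graph

# Normalized graph Laplacian, solved densely so only SciPy/LAPACK is involved
L = laplacian(graph, normed=True).toarray()
vals, vecs = eigh(L)

# The smallest eigenvalues/eigenvectors drive the spectral embedding;
# printing them with high precision allows a diff across architectures.
np.set_printoptions(precision=17)
print(vals[:5])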