
validation check for precomputed gram matrix fails erroneously when using float32 data


Describe the bug

A validation check for the precomputed gram matrix has been introduced in version 1.0.0 (https://github.com/scikit-learn/scikit-learn/pull/19004).

This check sometimes fails misleadingly when the matrix has dtype float32 and the arbitrarily selected feature columns are sparse.

Code snippet to reproduce attached.

I could open a PR in the next few days to fix this, if wanted.

Steps/Code to Reproduce


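from sklearn.linear_model import LassoCV
import numpy as np

m = LassoCV()

# Fixed seed so the failure is reproducible
np.random.seed(seed=3)

X = np.random.random((10000, 50)).astype(np.float32)
# Make columns 25 and 26 sparse: roughly 2% ones, the rest zeros
X[:, 25] = np.where(X[:, 25] < 0.98, 0, 1)
X[:, 26] = np.where(X[:, 26] < 0.98, 0, 1)
y = np.random.random((10000, 1)).astype(np.float32)

# Raises the ValueError shown under "Actual Results"
m.fit(X, y)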

Expected Results

No exception is thrown.

Actual Results

ValueError: Gram matrix passed in via 'precompute' parameter did not pass validation when a single element was checked - please check that it was computed properly. For element (25,26) we computed -0.4163646101951599 but the user-supplied value was -0.41635191440582275.
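To put the two numbers from the error message in perspective, the short sketch below compares them directly. The values are copied from the message above; the two tolerance settings passed to np.isclose are assumptions chosen for illustration and are not taken from the issue. It only shows that the gap is small in relative terms yet large enough to fail a tight tolerance.

```python
import numpy as np

# Values copied from the error message above.
recomputed = -0.4163646101951599
supplied = -0.41635191440582275

abs_diff = abs(recomputed - supplied)
rel_diff = abs_diff / abs(supplied)
print(f"absolute difference:     {abs_diff:.3e}")
print(f"relative difference:     {rel_diff:.3e}")
print(f"float32 machine epsilon: {np.finfo(np.float32).eps:.3e}")

# Illustrative tolerances only (assumptions, not taken from the issue).
strict = bool(np.isclose(recomputed, supplied, rtol=1e-7, atol=1e-5))
relaxed = bool(np.isclose(recomputed, supplied, rtol=1e-4, atol=1e-5))
print("passes with rtol=1e-7:", strict)
print("passes with rtol=1e-4:", relaxed)
```

The relative difference is a few hundred times the float32 machine epsilon, which is plausible for a dot product accumulated over 10000 single-precision values.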

Versions

scikit-learn 1.0.1

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

derHeinzer commented, Dec 17, 2021 (1 reaction)

I created a PR to fix this issue: https://github.com/scikit-learn/scikit-learn/pull/22008. Checking the Gram matrix is unnecessary when it was not provided by the user but computed by coordinate descent itself. Additionally, it would make sense to increase the tolerance values when the dtype of the matrices is float32.
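The second suggestion in this comment, a tolerance that depends on the dtype, could look roughly like the sketch below. This is not the code from the PR: the helper gram_entry_matches, the RTOL_BY_DTYPE table, and the tolerance values are all hypothetical and chosen only to illustrate the idea.

```python
import numpy as np

# Hypothetical relative tolerances per dtype (illustrative values only).
RTOL_BY_DTYPE = {np.dtype(np.float32): 1e-4, np.dtype(np.float64): 1e-7}


def gram_entry_matches(X, gram, i, j, atol=1e-5):
    """Recompute Gram entry (i, j) from X and compare it with gram[i, j].

    The relative tolerance is looked up from the dtype of X, so that
    single-precision data is checked less strictly than double precision.
    """
    rtol = RTOL_BY_DTYPE.get(np.dtype(X.dtype), 1e-7)
    expected = np.dot(X[:, i], X[:, j])
    return bool(np.isclose(expected, gram[i, j], rtol=rtol, atol=atol))


rng = np.random.default_rng(0)
X = rng.random((1000, 4)).astype(np.float32)
gram = X.T @ X  # float32 Gram matrix

ok = gram_entry_matches(X, gram, 1, 2)

# A genuinely wrong entry should still be rejected.
bad_gram = gram.copy()
bad_gram[1, 2] += 1.0
rejected = not gram_entry_matches(X, bad_gram, 1, 2)
print(ok, rejected)
```

The point of the design is that loosening the tolerance for float32 should only absorb rounding noise; a Gram matrix that was actually computed incorrectly should still fail the check.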

agramfort commented, May 28, 2022 (0 reactions)

@QuantHao #22208 is stalled as there is no test added. Can you take over the PR and add the necessary tests so we can consider merging? 🙏


