add details on how to use both `sample_weight` and `precompute` together for linear models

Describe the issue linked to the documentation

The documentation does not currently explain how the sample_weight argument to fit() interacts with precompute when the user passes in a precomputed Gram matrix. Using the two arguments together requires carefully preprocessing the data to replicate the steps performed in _pre_fit.

Here is a snippet of code demonstrating how to do it:

import numpy as np
from numpy.testing import assert_almost_equal
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=int(1e5), noise=0.5)

# Random lognormal weight vector, rescaled to sum to n_samples
# (assumed here to mirror the rescaling that fit() applies internally).
weights = np.random.lognormal(size=y.shape)
weights = weights * (y.shape[0] / weights.sum())

# Reference fit: the Gram matrix is not precomputed.
en = ElasticNet(alpha=0.01, fit_intercept=True, precompute=False)
en.fit(X, y, sample_weight=weights)

# Center X on its weighted column means, as is done when fitting an intercept.
X_c = X - np.average(X, axis=0, weights=weights)
# Multiply each row by the square root of its weight.
X_r = X_c * np.sqrt(weights)[:, np.newaxis]

# Pass the weighted Gram matrix together with the centered (unscaled) data.
en_precompute = ElasticNet(alpha=0.01, fit_intercept=True, precompute=X_r.T @ X_r)
en_precompute.fit(X_c, y, sample_weight=weights)

assert_almost_equal(en.coef_, en_precompute.coef_)
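
The preprocessing in the snippet can be wrapped in a small helper. What follows is a minimal NumPy-only sketch: `weighted_centered_gram` is a hypothetical name, not a scikit-learn function, and rescaling the weights to sum to `n_samples` is an assumption about what `fit()` does internally. It shows that scaling the centered rows by the square root of the weights yields the weighted Gram matrix `X_c.T @ diag(w) @ X_c`.

```python
import numpy as np


def weighted_centered_gram(X, sample_weight):
    """Return (X_c, gram) for a weighted fit with an intercept.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(sample_weight, dtype=float)
    n_samples = X.shape[0]
    # Assumption: rescale the weights so that they sum to n_samples.
    w = w * (n_samples / w.sum())
    # Center each column on its weighted mean.
    X_c = X - np.average(X, axis=0, weights=w)
    # Scale each row by sqrt(weight), so that
    # X_r.T @ X_r equals X_c.T @ diag(w) @ X_c.
    X_r = X_c * np.sqrt(w)[:, np.newaxis]
    return X_c, X_r.T @ X_r


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
w_demo = rng.lognormal(size=50)
X_c_demo, gram_demo = weighted_centered_gram(X_demo, w_demo)
```

The returned `X_c` is the array that would be passed to `fit()`, with `gram` passed as `precompute` in the constructor.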

Suggest a potential alternative/fix

Perhaps a section could be added to the user guide (suggested by @ogrisel on Gitter) on how to use these features together, and then that could be referenced from the docstring of the various models that take a precompute parameter in their constructors. @ogrisel also suggested adding a unit test (perhaps adapted from the above snippet) to make sure that this way of combining the two features isn’t inadvertently broken in the future.
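
A unit test along those lines might look like the sketch below. The test name, the dataset size, and the tolerances are illustrative assumptions, not the test that was eventually merged; `tol` and `max_iter` are tightened so that both solvers converge closely enough to compare coefficients.

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet


def test_precomputed_gram_with_sample_weight():
    X, y = make_regression(n_samples=200, n_features=5, noise=0.5, random_state=0)
    rng = np.random.RandomState(0)
    sample_weight = rng.lognormal(size=y.shape[0])
    # Assumption: weights are rescaled to sum to n_samples before use.
    w = sample_weight * (y.shape[0] / sample_weight.sum())

    # Weighted centering, then row scaling by sqrt(weight) for the Gram matrix.
    X_c = X - np.average(X, axis=0, weights=w)
    X_r = X_c * np.sqrt(w)[:, np.newaxis]
    gram = X_r.T @ X_r

    params = dict(alpha=0.01, fit_intercept=True, tol=1e-10, max_iter=10_000)
    reference = ElasticNet(precompute=False, **params).fit(X, y, sample_weight=w)
    with_gram = ElasticNet(precompute=gram, **params).fit(X_c, y, sample_weight=w)

    # Only the coefficients are compared: the intercepts differ because
    # the second model is fitted on the centered data X_c.
    assert_allclose(reference.coef_, with_gram.coef_, rtol=1e-5, atol=1e-6)
```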

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
agramfort commented, Dec 9, 2020

> awesome @agramfort, I’m happy to turn this in to an example - would this go in this directory: https://github.com/scikit-learn/scikit-learn/tree/master/examples/miscellaneous?

ok in linear models

> re: checking an element of the matrix, I guess I’d be a bit worried that it would give a false sense of security without being really guaranteed to catch a user error.

it’s a trade off.
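
The trade-off can be made concrete with a sketch of a single-element check. `spot_check_gram` is a hypothetical helper written for illustration, not scikit-learn's implementation, and the tolerances are assumptions. It recomputes one entry of the Gram matrix from the data, which costs only one dot product, but it cannot detect an error confined to other entries.

```python
import numpy as np


def spot_check_gram(X_r, gram, i, j, rtol=1e-7, atol=1e-5):
    """Compare one entry of a precomputed Gram matrix with the value
    recomputed from X_r (the centered, sqrt-weight-scaled design matrix).

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    expected = X_r[:, i] @ X_r[:, j]
    return bool(np.isclose(expected, gram[i, j], rtol=rtol, atol=atol))


rng = np.random.default_rng(0)
X_r = rng.normal(size=(100, 4))
gram = X_r.T @ X_r

bad_gram = gram.copy()
bad_gram[0, 1] += 1.0  # corrupt a single off-diagonal entry

ok = spot_check_gram(X_r, gram, 2, 3)              # correct matrix passes
caught = not spot_check_gram(X_r, bad_gram, 0, 1)  # the corrupted entry is detected
missed = spot_check_gram(X_r, bad_gram, 2, 3)      # a different entry still passes
```

A matrix built from entirely wrong data (for example, unweighted or uncentered) would most likely fail at any entry, which is the case such a check is cheap insurance against.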

0 reactions
cmarmo commented, Jan 18, 2021

Fixed by #19004.
