Fitting TransformedTargetRegressor with sample_weight in Pipeline
Description
Can't fit a TransformedTargetRegressor using sample_weight when the wrapped regressor is a Pipeline. Maybe linked to #10945?
Steps/Code to Reproduce
Example:
import pandas as pd
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler, OneHotEncoder
from sklearn.compose import TransformedTargetRegressor, ColumnTransformer, make_column_transformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import make_regression
# Create dataset
X, y = make_regression(n_samples=10000, noise=100, n_features=10, random_state=2019)
y = np.exp((y + abs(y.min())) / 200)
w = np.random.randn(len(X))
cat_list = ['AA', 'BB', 'CC', 'DD']
cat = np.random.choice(cat_list, len(X), p=[0.3, 0.2, 0.2, 0.3])
df = pd.DataFrame(X, columns=["col_" + str(i) for i in range(1, 11)])
df['sample_weight'] = w
df['my_category'] = cat
df.head()
use_col = [col for col in df.columns if col not in ['sample_weight']]
numerical_features = df[use_col].dtypes == 'float'
categorical_features = ~numerical_features
categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])
preprocess = make_column_transformer(
    (RobustScaler(), numerical_features),
    (OneHotEncoder(sparse=False), categorical_features)
)
rf = RandomForestRegressor(n_estimators=20)
clf = Pipeline(steps=[
    ('preprocess', preprocess),
    ('model', rf)
])
clf_trans = TransformedTargetRegressor(regressor=clf,
                                       func=np.log1p,
                                       inverse_func=np.expm1)
# Works
clf_trans.fit(df[use_col], y)
# Fails
clf_trans.fit(df[use_col], y, sample_weight=df['sample_weight'])
Expected Results
Fitting with sample_weight should succeed without raising an error.
Actual Results
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-366d815659ba> in <module>()
----> 1 clf_trans.fit(df[use_col], y, sample_weight=df['sample_weight'])
~/anaconda3/envs/test_env/lib/python3.5/site-packages/sklearn/compose/_target.py in fit(self, X, y, sample_weight)
194 self.regressor_.fit(X, y_trans)
195 else:
--> 196 self.regressor_.fit(X, y_trans, sample_weight=sample_weight)
197
198 return self
~/anaconda3/envs/test_env/lib/python3.5/site-packages/sklearn/pipeline.py in fit(self, X, y, **fit_params)
263 This estimator
264 """
--> 265 Xt, fit_params = self._fit(X, y, **fit_params)
266 if self._final_estimator is not None:
267 self._final_estimator.fit(Xt, y, **fit_params)
~/anaconda3/envs/test_env/lib/python3.5/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params)
200 if step is not None)
201 for pname, pval in six.iteritems(fit_params):
--> 202 step, param = pname.split('__', 1)
203 fit_params_steps[step][param] = pval
204 Xt = X
ValueError: not enough values to unpack (expected 2, got 1)
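The traceback points at the line in Pipeline._fit that splits each fit parameter name on '__' to find the target step. A minimal sketch of that unpacking, in plain Python, shows why a bare sample_weight key fails while a step-prefixed key does not. The helper name split_fit_param is hypothetical and only mirrors the single line shown in the traceback; it is not part of scikit-learn.

```python
def split_fit_param(name):
    """Mimic the unpacking done on line 202 of the traceback (hypothetical helper)."""
    # str.split('__', 1) returns a single-element list when '__' is absent,
    # so unpacking into two names raises ValueError.
    step, param = name.split('__', 1)
    return step, param


# A step-prefixed name splits cleanly into (step, parameter).
print(split_fit_param('model__sample_weight'))

# A bare name, as forwarded by TransformedTargetRegressor, cannot be unpacked.
try:
    split_fit_param('sample_weight')
except ValueError as exc:
    print('ValueError:', exc)
```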
Versions
import sklearn; sklearn.show_versions()
System:
machine: Linux-4.4.0-127-generic-x86_64-with-debian-stretch-sid
executable: /home/gillesa/anaconda3/envs/test_env/bin/python
python: 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 21:41:56) [GCC 7.3.0]
BLAS:
cblas_libs: cblas
lib_dirs:
macros:
Python deps:
sklearn: 0.20.2
pandas: 0.24.1
pip: 19.0.1
setuptools: 40.2.0
numpy: 1.16.1
Cython: None
scipy: 1.2.0
Issue Analytics
- State:
- Created 5 years ago
- Comments:17 (15 by maintainers)
Top GitHub Comments
You're right. We don't yet seem to properly support fit parameters in TransformedTargetRegressor. And perhaps we should…
Cool, I’ll give it a try then
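Note: a possible workaround, not taken from the thread, is to invert the nesting so that the Pipeline is the outer estimator and TransformedTargetRegressor wraps only the final regressor. The weights can then be passed with the usual step-prefixed name. The sketch below is a simplified, assumed variant of the reproduction (numeric features only, a smaller dataset, non-negative weights), so the data, step names and sizes are illustrative rather than the reporter's setup.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler

# Small illustrative dataset (assumption: numeric features only).
X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)
y = np.exp((y + abs(y.min())) / 200)

# Non-negative weights (assumption: the original used randn, which can be negative).
rng = np.random.RandomState(0)
w = rng.rand(len(X))

# The target transformation wraps only the final regressor.
model = TransformedTargetRegressor(
    regressor=RandomForestRegressor(n_estimators=5, random_state=0),
    func=np.log1p,
    inverse_func=np.expm1,
)
pipe = Pipeline(steps=[('preprocess', RobustScaler()), ('model', model)])

# The 'model__' prefix routes sample_weight to the final step's fit method.
pipe.fit(X, y, model__sample_weight=w)
pred = pipe.predict(X)
```

Because the preprocessing steps never touch y, this arrangement should give the same target handling as wrapping the whole Pipeline, while keeping the fit parameter name in the form Pipeline expects.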