MultiOutputRegressor: Support for more fit parameters
See original GitHub issue

Description
This is a feature request. As of the latest version of scikit-learn, MultiOutputRegressor.fit only supports an optional sample_weight parameter. It would be nice if it also accepted an optional fit_param argument whose contents are forwarded to the underlying estimator.fit, which would make the estimator more flexible. For example, we could then use LightGBM's or XGBoost's early-stopping fit parameters to mitigate over-fitting.
I know this is a little complicated to implement, but I hope you will consider it. Thanks!
Steps/Code to Reproduce
This is my expected usage example.
#!/usr/bin/env python3
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
import lightgbm as lgb

train_X = np.random.random((10, 10))
train_y = np.random.random((10, 4))
eval_X = np.random.random((10, 10))
eval_y = np.random.random((10, 4))

single_model = lgb.LGBMRegressor()
model = MultiOutputRegressor(single_model)

# Desired behaviour (not supported yet): forward these keyword arguments
# to each underlying LGBMRegressor.fit call.
fit_param = {'verbose': False, 'early_stopping_rounds': 10, 'eval_set': [(eval_X, eval_y)]}
model.fit(train_X, train_y, fit_param=fit_param)
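Until something like this is supported, the intended behaviour can be approximated by hand, since MultiOutputRegressor essentially fits one clone of the estimator per output column. The sketch below assumes LightGBM's sklearn wrapper and uses the callbacks-based early stopping of newer LightGBM releases instead of the early_stopping_rounds fit argument shown above:

import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
train_X = rng.random((100, 10))
train_y = rng.random((100, 4))
eval_X = rng.random((30, 10))
eval_y = rng.random((30, 4))

models = []
for k in range(train_y.shape[1]):
    # One regressor per target column, each with its own sliced eval_set,
    # which is exactly the per-target slicing a fit_params-aware
    # MultiOutputRegressor would need to perform internally.
    est = lgb.LGBMRegressor(n_estimators=500)
    est.fit(train_X, train_y[:, k],
            eval_set=[(eval_X, eval_y[:, k])],
            callbacks=[lgb.early_stopping(stopping_rounds=10, verbose=False)])
    models.append(est)

pred = np.column_stack([m.predict(eval_X) for m in models])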
Expected Results
The contents of fit_param are forwarded to each underlying estimator's fit call, so early stopping can be used per target.
Actual Results
Not supported yet: MultiOutputRegressor.fit only accepts the optional sample_weight.
Versions
scikit-learn: 0.22
Platform: Windows-10-10.0.14393-SP0
Python: 3.6.9
Issue Analytics
- Created 4 years ago
- Comments: 10 (6 by maintainers)
Hi! I believe the current implementation still does not support passing the eval_set for early stopping (at least for XGBoost). The problem is that the feature matrices and targets provided by eval_set are never propagated along the chain: the matrices are never augmented, and the targets (which are 2D matrices themselves, since it's a chain) are never split into single-column vectors before being passed to the fit method.
For example:
Result:
As you can see, the target of eval_set (eval_y) is passed to XGBoost as a 2D matrix, which is not allowed. Even if you fix the problem for eval_y, the feature matrix of eval_set (eval_X) is not augmented when traversing the chain, so the next iteration raises an error as well.
Versions: XGBoost: 1.4.2, scikit-learn: 0.24.2
In this case, RegressorChain or MultiOutputRegressor does not know which fit parameters to slice.
To be fully generic, we would need to accept a process_fit_params callable parameter in RegressorChain or MultiOutputRegressor. During fit, the indices to slice on and the fit_params are passed to the callable, which returns the new fit params with the data correctly sliced.
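For concreteness, here is a rough sketch of what such a hook could look like; process_fit_params is the hypothetical callable discussed above (not an existing scikit-learn parameter), called once per target with the raw fit params and the index of the output currently being fitted:

def process_fit_params(fit_params, output_index, X_augmented=None):
    # Hypothetical hook: return a copy of the fit params with data-shaped
    # entries sliced for the current target; X_augmented would carry the
    # chain-augmented validation features when used with RegressorChain.
    params = dict(fit_params)
    if 'eval_set' in params:
        params['eval_set'] = [
            (X_augmented if X_augmented is not None else X_val,
             y_val[:, output_index])
            for X_val, y_val in params['eval_set']
        ]
    return params

# Hypothetical usage (this keyword does not exist today):
# model = MultiOutputRegressor(lgb.LGBMRegressor(),
#                              process_fit_params=process_fit_params)
# model.fit(train_X, train_y,
#           eval_set=[(eval_X, eval_y)], early_stopping_rounds=10)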