Fitting a TransformedTargetRegressor to 3D target fails
Describe the bug
I created a TransformedTargetRegressor with a transformer that reshapes a 3D array into a 2D array, and a MultiOutputRegressor as the regressor. When I call fit, it throws the error "Found array with dim 3. Estimator expected <= 2." A 3D target should be fine here, as the transformer would transform it into a 2D array before it reaches the regressor.
Steps/Code to Reproduce
from sklearn.compose import TransformedTargetRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.preprocessing import FunctionTransformer
from sklearn.linear_model import LinearRegression
import numpy as np

X = np.arange(100).reshape(10, 10)
y = np.arange(60).reshape(10, 3, 2)  # 3D target: 10 samples x 3 points x 2 coordinates

def flatten_coords(coords):
    # (n_samples, n_points, 2) -> (n_samples, n_points * 2)
    return coords.reshape(coords.shape[0], -1)

def unflatten_coords(coords):
    # (n_samples, n_points * 2) -> (n_samples, n_points, 2)
    return coords.reshape(coords.shape[0], -1, 2)

coords_flattener = FunctionTransformer(flatten_coords, unflatten_coords)
model = TransformedTargetRegressor(
    regressor=MultiOutputRegressor(LinearRegression()),
    transformer=coords_flattener
)
model.fit(X, y)
Expected Results
No error is thrown
Actual Results
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-a584e7c9bc04> in <module>
     18     transformer=coords_flattener
     19 )
---> 20 model.fit(X, y)

~/.pyenv/versions/3.7.4/envs/unspun_analysis/lib/python3.7/site-packages/sklearn/compose/_target.py in fit(self, X, y, **fit_params)
    177         """
    178         y = check_array(y, accept_sparse=False, force_all_finite=True,
--> 179                         ensure_2d=False, dtype='numeric')
    180
    181         # store the number of dimension of the target to predict an array of

~/.pyenv/versions/3.7.4/envs/unspun_analysis/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74

~/.pyenv/versions/3.7.4/envs/unspun_analysis/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    639         if not allow_nd and array.ndim >= 3:
    640             raise ValueError("Found array with dim %d. %s expected <= 2."
--> 641                              % (array.ndim, estimator_name))
    642
    643         if force_all_finite:

ValueError: Found array with dim 3. Estimator expected <= 2.
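The traceback shows that the failure comes from the check_array call on y, before the transformer ever runs. A minimal sketch that isolates that check is given here. It assumes a scikit-learn version in which check_array accepts the ensure_2d, dtype and allow_nd parameters listed in the signature above; the variable names rejected_by_default and y_checked are illustrative only, and the exact wording of the error message may differ between versions.

```python
import numpy as np
from sklearn.utils import check_array

# Same 3D target as in the reproduction above.
y = np.arange(60).reshape(10, 3, 2)

# With the default allow_nd=False, arrays with 3 or more dimensions are rejected.
try:
    check_array(y, ensure_2d=False, dtype="numeric")
    rejected_by_default = False
except ValueError:
    rejected_by_default = True

# With allow_nd=True the same array passes validation with its shape unchanged.
y_checked = check_array(y, ensure_2d=False, dtype="numeric", allow_nd=True)

print(rejected_by_default, y_checked.shape)
```

This only demonstrates where the dimensionality check lives; it is not a statement about how the library resolved the issue.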
Versions
System:
python: 3.7.4 (default, Feb 25 2020, 10:49:46) [Clang 10.0.1 (clang-1001.0.46.4)]
executable: /Users/unspun/.pyenv/versions/3.7.4/envs/unspun_analysis/bin/python
machine: Darwin-18.6.0-x86_64-i386-64bit
Python dependencies:
pip: 20.2
setuptools: 46.1.3
sklearn: 0.23.2
numpy: 1.19.4
scipy: 1.4.1
Cython: 0.29.13
pandas: 1.1.0
matplotlib: 3.0.3
joblib: 0.14.0
threadpoolctl: 2.1.0
Built with OpenMP: True
Issue Analytics
- State:
- Created 3 years ago
- Comments:8 (8 by maintainers)
Yes, I think we can remove the check_array call in https://github.com/scikit-learn/scikit-learn/blob/0fb307bf39bbdacd6ed713c00724f8f871d60370/sklearn/compose/_target.py#L178. X validation will be done by the regressor anyway, we never use X in TransformedTargetRegressor.fit, and the performance cost is not zero either.

Hi @panangam, thanks for pinging. I'm sorry to say that the only thing needed on your side is a bit of patience... scikit-learn is really understaffed on reviewers having the right to approve... I've labeled the PR as "Waiting for Reviewer" for now. Thanks for your understanding.
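For versions where TransformedTargetRegressor.fit rejects a 3D target, one possible workaround is to apply the reshaping by hand around the inner regressor. The sketch below is not taken from the thread; it assumes the same flatten_coords and unflatten_coords helpers as the reproduction and skips TransformedTargetRegressor entirely.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor


def flatten_coords(coords):
    # (n_samples, n_points, 2) -> (n_samples, n_points * 2)
    return coords.reshape(coords.shape[0], -1)


def unflatten_coords(coords):
    # (n_samples, n_points * 2) -> (n_samples, n_points, 2)
    return coords.reshape(coords.shape[0], -1, 2)


X = np.arange(100).reshape(10, 10)
y = np.arange(60).reshape(10, 3, 2)

# Fit on the flattened 2D target, which the regressor accepts directly.
regressor = MultiOutputRegressor(LinearRegression())
regressor.fit(X, flatten_coords(y))

# Restore the 3D layout on the predictions.
y_pred = unflatten_coords(regressor.predict(X))
print(y_pred.shape)
```

The trade-off is that the reshaping is no longer bundled into a single estimator, so it has to be repeated wherever the model is fitted or used for prediction.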