GradientBoostingRegressor with huber loss sometimes fails with an `IndexError: cannot do a non-empty take from an empty axes.`
If I use the first 63726 lines of my dataset for training, everything works, but if I add one more line to the training set I get the error. My dataset contains no NaNs. The 63727th line has no obvious differences from the others; furthermore, if I use lines 63000:64000 for training I don't get the error, which suggests that the content of line 63727 isn't directly the problem.
I have tried and failed to make a small reproducible test, so I hope someone can make sense of what is happening here and why.
Versions
Windows-2008ServerR2-6.1.7601-SP1
Python 3.6.3 |Intel Corporation| (default, Oct 17 2017, 23:26:12) [MSC v.1900 64 bit (AMD64)]
NumPy 1.13.3
SciPy 0.19.1
Scikit-Learn 0.19.0
Error message I got when using RandomizedSearchCV:
Sub-process traceback:
---------------------------------------------------------------------------
IndexError Thu Dec 7 19:08:00 2017
PID: 11688 Python 3.6.3: C:\ProgramData\Anaconda3\envs\i3\python.exe
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self=<sklearn.externals.joblib.parallel.BatchedCalls object>)
126 def __init__(self, iterator_slice):
127 self.items = list(iterator_slice)
128 self._size = len(self.items)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
self.items = [(<function _fit_and_score>, (GradientBoostingRegressor(alpha=0.9, criterion='...le=1.0, verbose=0, warm_start=False), memmap([[ -1.00000000e+00, -2.00000003e-01, -1....14299998e-02, 1.42734203e+01]], dtype=float32), memmap([ 27., 35., -78., ..., -19., -9., -4.], dtype=float32), {'score': <function MaxWinRate>}, array([ 0, 1, 2, ..., 12997, 12998, 12999]), memmap([ 13000, 13001, 13002, ..., 618893, 618894, 618895]), 10, {'learning_rate': 0.28522087352060566, 'max_depth': 16, 'max_features': 0.32058686573551248, 'min_samples_leaf': 1, 'n_estimators': 327}), {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': False, 'return_times': True, 'return_train_score': True})]
132
133 def __len__(self):
134 return self._size
135
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\externals\joblib\parallel.py in <listcomp>(.0=<list_iterator object>)
126 def __init__(self, iterator_slice):
127 self.items = list(iterator_slice)
128 self._size = len(self.items)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
func = <function _fit_and_score>
args = (GradientBoostingRegressor(alpha=0.9, criterion='...le=1.0, verbose=0, warm_start=False), memmap([[ -1.00000000e+00, -2.00000003e-01, -1 ....14299998e-02, 1.42734203e+01]], dtype=float32), memmap([ 27., 35., -78., ..., -19., -9., -4.], dtype=float32), {'score': <function MaxWinRate>}, array([ 0, 1, 2, ..., 12997, 12998, 12999]), memmap([ 13000, 13001, 13002, ..., 618893, 618894, 618895]), 10, {'learning_rate': 0.28522087352060566, 'max_depth': 16, 'max_features': 0.32058686573551248, 'min_samples_leaf': 1, 'n_estimators': 327})
kwargs = {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': False, 'return_times': True, 'return_train_score': True}
132
133 def __len__(self):
134 return self._size
135
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\model_selection\_validation.py in _fit_and_score(estimator=GradientBoostingRegressor(alpha=0.9, criterion='...le=1.0, verbose=0, warm_start=False), X=memmap([[ -1.00000000e+00, -2.00000003e-01, -1....14299998e-02, 1.42734203e+01]], dtype=float32), y=memmap([ 27., 35., -78., ..., -19., -9., -4.], dtype=float32), scorer={'score': <function MaxWinRate>}, train=array([ 0, 1, 2, ..., 12997, 12998, 12999]), test=memmap([ 13000, 13001, 13002, ..., 618893, 618894, 618895]), verbose=10, parameters={'learning_rate': 0.28522087352060566, 'max_depth': 16, 'max_features': 0.32058686573551248, 'min_samples_leaf': 1, 'n_estimators': 327}, fit_params={}, return_train_score=True, return_parameters=False, return_n_test_samples=True, return_times=True, error_score='raise')
432
433 try:
434 if y_train is None:
435 estimator.fit(X_train, **fit_params)
436 else:
--> 437 estimator.fit(X_train, y_train, **fit_params)
estimator.fit = <bound method BaseGradientBoosting.fit of Gradie...e=1.0, verbose=0, warm_start=False)>
X_train = memmap([[ -1.00000000e+00, -2.00000003e-01, -1....04051296e+09, 1.89901295e+01]], dtype=float32)
y_train = memmap([ 27., 35., -78., ..., 9., 21., 20.], dtype=float32)
fit_params = {}
438
439 except Exception as e:
440 # Note fit time as time until error
441 fit_time = time.time() - start_time
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\ensemble\gradient_boosting.py in fit(self=GradientBoostingRegressor(alpha=0.9, criterion='...le=1.0, verbose=0, warm_start=False), X=array([[ -1.00000000e+00, -2.00000003e-01, -1.....04051296e+09, 1.89901295e+01]], dtype=float32), y=memmap([ 27., 35., -78., ..., 9., 21., 20.], dtype=float32), sample_weight=array([ 1., 1., 1., ..., 1., 1., 1.], dtype=float32), monitor=None)
1029 X_idx_sorted = np.asfortranarray(np.argsort(X, axis=0),
1030 dtype=np.int32)
1031
1032 # fit the boosting stages
1033 n_stages = self._fit_stages(X, y, y_pred, sample_weight, random_state,
-> 1034 begin_at_stage, monitor, X_idx_sorted)
begin_at_stage = 0
monitor = None
X_idx_sorted = array([[ 0, 8619, 0, ..., 859, 869, ...[ 6499, 7945, 6499, ..., 11039, 10053, 8487]])
1035 # change shape of arrays after fit (early-stopping or additional ests)
1036 if n_stages != self.estimators_.shape[0]:
1037 self.estimators_ = self.estimators_[:n_stages]
1038 self.train_score_ = self.train_score_[:n_stages]
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\ensemble\gradient_boosting.py in _fit_stages(self=GradientBoostingRegressor(alpha=0.9, criterion='...le=1.0, verbose=0, warm_start=False), X=array([[ -1.00000000e+00, -2.00000003e-01, -1.....04051296e+09, 1.89901295e+01]], dtype=float32), y=memmap([ 27., 35., -78., ..., 9., 21., 20.], dtype=float32), y_pred=array([[ 26.99980487], [ 34.99961777], ...], [ 21.00002664], [ 19.99974011]]), sample_weight=array([ 1., 1., 1., ..., 1., 1., 1.], dtype=float32), random_state=<mtrand.RandomState object>, begin_at_stage=0, monitor=None, X_idx_sorted=array([[ 0, 8619, 0, ..., 859, 869, ...[ 6499, 7945, 6499, ..., 11039, 10053, 8487]]))
1084 sample_weight[~sample_mask])
1085
1086 # fit next stage of trees
1087 y_pred = self._fit_stage(i, X, y, y_pred, sample_weight,
1088 sample_mask, random_state, X_idx_sorted,
-> 1089 X_csc, X_csr)
X_csc = None
X_csr = None
1090
1091 # track deviance (= loss)
1092 if do_oob:
1093 self.train_score_[i] = loss_(y[sample_mask],
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\ensemble\gradient_boosting.py in _fit_stage(self=GradientBoostingRegressor(alpha=0.9, criterion='...le=1.0, verbose=0, warm_start=False), i=122, X=array([[ -1.00000000e+00, -2.00000003e-01, -1.....04051296e+09, 1.89901295e+01]], dtype=float32), y=memmap([ 27., 35., -78., ..., 9., 21., 20.], dtype=float32), y_pred=array([[ 26.99980487], [ 34.99961777], ...], [ 21.00002664], [ 19.99974011]]), sample_weight=array([ 1., 1., 1., ..., 1., 1., 1.], dtype=float32), sample_mask=array([ True, True, True, ..., True, True, True], dtype=bool), random_state=<mtrand.RandomState object>, X_idx_sorted=array([[ 0, 8619, 0, ..., 859, 869, ...[ 6499, 7945, 6499, ..., 11039, 10053, 8487]]), X_csc=None, X_csr=None)
793 sample_weight, sample_mask,
794 self.learning_rate, k=k)
795 else:
796 loss.update_terminal_regions(tree.tree_, X, y, residual, y_pred,
797 sample_weight, sample_mask,
--> 798 self.learning_rate, k=k)
self.learning_rate = 0.28522087352060566
k = 0
799
800 # add tree to ensemble
801 self.estimators_[i, k] = tree
802
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\ensemble\gradient_boosting.py in update_terminal_regions(self=<sklearn.ensemble.gradient_boosting.HuberLossFunction object>, tree=<sklearn.tree._tree.Tree object>, X=array([[ -1.00000000e+00, -2.00000003e-01, -1.....04051296e+09, 1.89901295e+01]], dtype=float32), y=memmap([ 27., 35., -78., ..., 9., 21., 20.], dtype=float32), residual=array([ 1.95126553e-04, 3.82230297e-04, 9.5...1895189e-06, -2.66392852e-05, 2.59891505e-04]), y_pred=array([[ 26.99980487], [ 34.99961777], ...], [ 21.00002664], [ 19.99974011]]), sample_weight=array([ 1., 1., 1., ..., 1., 1., 1.], dtype=float32), sample_mask=array([ True, True, True, ..., True, True, True], dtype=bool), learning_rate=0.28522087352060566, k=0)
244
245 # update each leaf (= perform line search)
246 for leaf in np.where(tree.children_left == TREE_LEAF)[0]:
247 self._update_terminal_region(tree, masked_terminal_regions,
248 leaf, X, y, residual,
--> 249 y_pred[:, k], sample_weight)
y_pred = array([[ 26.99980487], [ 34.99961777], ...], [ 21.00002664], [ 19.99974011]])
k = 0
sample_weight = array([ 1., 1., 1., ..., 1., 1., 1.], dtype=float32)
250
251 # update predictions (both in-bag and out-of-bag)
252 y_pred[:, k] += (learning_rate
253 * tree.value[:, 0, 0].take(terminal_regions, axis=0))
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\ensemble\gradient_boosting.py in _update_terminal_region(self=<sklearn.ensemble.gradient_boosting.HuberLossFunction object>, tree=<sklearn.tree._tree.Tree object>, terminal_regions=array([237, 237, 86, ..., 237, 237, 237], dtype=int64), leaf=260, X=array([[ -1.00000000e+00, -2.00000003e-01, -1.....04051296e+09, 1.89901295e+01]], dtype=float32), y=memmap([ 27., 35., -78., ..., 9., 21., 20.], dtype=float32), residual=array([ 1.95126553e-04, 3.82230297e-04, 9.5...1895189e-06, -2.66392852e-05, 2.59891505e-04]), pred=array([ 26.99980487, 34.99961777, -78.03339049,... 9.00000162, 21.00002664, 19.99974011]), sample_weight=array([], dtype=float32))
385 terminal_region = np.where(terminal_regions == leaf)[0]
386 sample_weight = sample_weight.take(terminal_region, axis=0)
387 gamma = self.gamma
388 diff = (y.take(terminal_region, axis=0)
389 - pred.take(terminal_region, axis=0))
--> 390 median = _weighted_percentile(diff, sample_weight, percentile=50)
median = undefined
diff = array([], dtype=float64)
sample_weight = array([], dtype=float32)
391 diff_minus_median = diff - median
392 tree.value[leaf, 0] = median + np.mean(
393 np.sign(diff_minus_median) *
394 np.minimum(np.abs(diff_minus_median), gamma))
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\utils\stats.py in _weighted_percentile(array=array([], dtype=float64), sample_weight=array([], dtype=float32), percentile=50)
17 Compute the weighted ``percentile`` of ``array`` with ``sample_weight``.
18 """
19 sorted_idx = np.argsort(array)
20
21 # Find index of median prediction for each sample
---> 22 weight_cdf = stable_cumsum(sample_weight[sorted_idx])
weight_cdf = undefined
sample_weight = array([], dtype=float32)
sorted_idx = array([], dtype=int64)
23 percentile_idx = np.searchsorted(
24 weight_cdf, (percentile / 100.) * weight_cdf[-1])
25 return array[sorted_idx[percentile_idx]]
...........................................................................
C:\ProgramData\Anaconda3\envs\i3\lib\site-packages\sklearn\utils\extmath.py in stable_cumsum(arr=array([], dtype=float32), axis=None, rtol=1e-05, atol=1e-08)
757 if np_version < (1, 9):
758 return np.cumsum(arr, axis=axis, dtype=np.float64)
759
760 out = np.cumsum(arr, axis=axis, dtype=np.float64)
761 expected = np.sum(arr, axis=axis, dtype=np.float64)
--> 762 if not np.all(np.isclose(out.take(-1, axis=axis), expected, rtol=rtol,
out.take = <built-in method take of numpy.ndarray object>
axis = None
expected = 0.0
rtol = 1e-05
atol = 1e-08
763 atol=atol, equal_nan=True)):
764 warnings.warn('cumsum was found to be unstable: '
765 'its last element does not correspond to sum',
766 RuntimeWarning)
IndexError: cannot do a non-empty take from an empty axes.
They say that RandomForests and friends don't need data normalisation, but they forget to mention that you have to be careful with near-infinite values. A training set that contains such extreme values will provoke the error 9 times out of 10.
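The reproducer snippet behind that claim did not survive on this page. A hypothetical sketch in the same spirit (the sizes, ranges and hyperparameters below are illustrative, not taken from the issue) pairs an ordinary feature with a near-infinite float32 column and fits a huber-loss GradientBoostingRegressor:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
n_samples = 20000

# One ordinary feature plus one column of huge ("near infinite") float32 values,
# similar in magnitude to the 2.04e+09 entries visible in the traceback above.
X = np.column_stack([
    rng.uniform(-1.0, 1.0, n_samples),
    rng.uniform(1e9, 2e9, n_samples),
]).astype(np.float32)
y = rng.normal(size=n_samples).astype(np.float32)

# Huber loss is what routes fitting through _update_terminal_region,
# where an empty terminal region blows up.
est = GradientBoostingRegressor(loss='huber', n_estimators=200, max_depth=8,
                                random_state=0)
est.fit(X, y)  # may raise IndexError: cannot do a non-empty take from an empty axes.
```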
As far as I can see, the only way a GradientBoostingRegressor can have an empty leaf is if the DecisionTreeRegressor returns a tree with an empty leaf, and sure enough, a check like the one sketched below often reveals several empty leaves.
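That check is also missing from this page; a minimal sketch of what it could look like, assuming a tree fitted on the same kind of float32 data with a near-infinite column, counts how many samples `tree.apply` routes to each leaf:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = np.column_stack([rng.uniform(-1.0, 1.0, 20000),
                     rng.uniform(1e9, 2e9, 20000)]).astype(np.float32)
y = rng.normal(size=20000)

tree = DecisionTreeRegressor(max_depth=16, random_state=0)
tree.fit(X, y)

# Leaf nodes are the ones with no left child (children_left == TREE_LEAF == -1).
leaves = np.where(tree.tree_.children_left == -1)[0]

# Route the training samples back through the fitted tree and count them per node.
# With such huge float32 values, apply() can send a sample to a different leaf
# than the one it populated during fitting, so some leaves can come back empty.
counts = np.bincount(tree.apply(X), minlength=tree.tree_.node_count)

print("empty leaves:", leaves[counts[leaves] == 0])
```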
Possible solutions
Sure!
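For anyone hitting this in the meantime, a workaround consistent with the near-infinite-values observation above (not a fix in scikit-learn itself, and the bound below is an arbitrary, illustrative choice) is to clip or rescale the offending columns before fitting:

```python
import numpy as np

def tame_extreme_columns(X, max_abs=1e6):
    """Clip every column of X to [-max_abs, max_abs] so that the huge
    magnitudes that trigger the precision problem never reach the trees.
    The 1e6 bound is an illustrative choice, not from the issue."""
    X = np.asarray(X, dtype=np.float64)
    return np.clip(X, -max_abs, max_abs)
```

Log-scaling the huge column instead (for example `np.log1p` on non-negative magnitudes) is an alternative if the relative ordering of the large values carries information you want to keep.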