Really slow tests on aarch64
Attached log: logs_conda-forge_scikit-learn-feedstock_2_2.log
I can understand if these end up as “won’t fix”, but when building on Drone, some tests time out after 5 minutes, though only sometimes.
The affected tests are:
- test_least_absolute_deviation
- test_missing_values_minmax_imputation
- test_warm_start_clear
- test_recursion_decision_function
- test_estimators
- test_early_stopping_regression
- test_missing_values_resilience
- test_warm_start_yields_identical_results
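To reproduce this outside the full build, it may help to rerun only the tests listed above. The following is a minimal sketch, not part of the feedstock recipe: `build_pytest_command` is a hypothetical helper, `--timeout` assumes the pytest-timeout plugin is installed (the `Timeout >300.0s` failures suggest it is), and the `OMP_NUM_THREADS` cap is only something to try, prompted by the `num_threads = 96` value in the TSNE frame of the log, not a confirmed cause.

```python
import os
import shlex

# Test names copied from the list above.
FLAKY_TESTS = [
    "test_least_absolute_deviation",
    "test_missing_values_minmax_imputation",
    "test_warm_start_clear",
    "test_recursion_decision_function",
    "test_estimators",
    "test_early_stopping_regression",
    "test_missing_values_resilience",
    "test_warm_start_yields_identical_results",
]


def build_pytest_command(tests, timeout_s=300, omp_threads=None):
    """Return (argv, env) for rerunning only the given tests (hypothetical helper).

    -k and --pyargs are standard pytest options; --timeout is provided by
    the pytest-timeout plugin, which is assumed to be installed.
    """
    # -k does substring matching, so "test_estimators" selects every
    # parametrization of that test, not only the TSNE one from the log.
    k_expr = " or ".join(tests)
    argv = [
        "python", "-m", "pytest",
        "--pyargs", "sklearn",
        "-k", k_expr,
        f"--timeout={timeout_s}",
    ]
    env = dict(os.environ)
    if omp_threads is not None:
        # Assumption: capping OpenMP threads might reduce contention when
        # several test workers share one machine. Untested on the CI host.
        env["OMP_NUM_THREADS"] = str(omp_threads)
    return argv, env


if __name__ == "__main__":
    argv, env = build_pytest_command(FLAKY_TESTS, omp_threads=2)
    # Print a copy-pasteable shell command rather than running it here.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The returned `argv` and `env` can be passed to `subprocess.run(argv, env=env)` to compare timings with and without the thread cap.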
Logs
________________________ test_least_absolute_deviation _________________________
[gw7] linux -- Python 3.7.3 $PREFIX/bin/python
def test_least_absolute_deviation():
# For coverage only.
X, y = make_regression(n_samples=500, random_state=0)
gbdt = HistGradientBoostingRegressor(loss='least_absolute_deviation',
random_state=0)
> gbdt.fit(X, y)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py:163:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:319: in fit
grower.grow()
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/grower.py:252: in grow
self.split_next()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.grower.TreeGrower object at 0xffff8ace4470>
def split_next(self):
"""Split the node with highest potential gain.
Returns
-------
left : TreeNode
The resulting left child.
right : TreeNode
The resulting right child.
"""
# Consider the node with the highest loss reduction (a.k.a. gain)
node = heappop(self.splittable_nodes)
tic = time()
(sample_indices_left,
sample_indices_right,
right_child_pos) = self.splitter.split_indices(node.split_info,
> node.sample_indices)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/grower.py:320: Failed
____________________ test_missing_values_minmax_imputation _____________________
[gw5] linux -- Python 3.7.3 $PREFIX/bin/python
def test_missing_values_minmax_imputation():
# Compare the buit-in missing value handling of Histogram GBC with an
# a-priori missing value imputation strategy that should yield the same
# results in terms of decision function.
#
# Each feature (containing NaNs) is replaced by 2 features:
# - one where the nans are replaced by min(feature) - 1
# - one where the nans are replaced by max(feature) + 1
# A split where nans go to the left has an equivalent split in the
# first (min) feature, and a split where nans go to the right has an
# equivalent split in the second (max) feature.
#
# Assuming the data is such that there is never a tie to select the best
# feature to split on during training, the learned decision trees should be
# strictly equivalent (learn a sequence of splits that encode the same
# decision function).
#
# The MinMaxImputer transformer is meant to be a toy implementation of the
# "Missing In Attributes" (MIA) missing value handling for decision trees
# https://www.sciencedirect.com/science/article/abs/pii/S0167865508000305
# The implementation of MIA as an imputation transformer was suggested by
# "Remark 3" in https://arxiv.org/abs/1902.06931
class MinMaxImputer(BaseEstimator, TransformerMixin):
def fit(self, X, y=None):
mm = MinMaxScaler().fit(X)
self.data_min_ = mm.data_min_
self.data_max_ = mm.data_max_
return self
def transform(self, X):
X_min, X_max = X.copy(), X.copy()
for feature_idx in range(X.shape[1]):
nan_mask = np.isnan(X[:, feature_idx])
X_min[nan_mask, feature_idx] = self.data_min_[feature_idx] - 1
X_max[nan_mask, feature_idx] = self.data_max_[feature_idx] + 1
return np.concatenate([X_min, X_max], axis=1)
def make_missing_value_data(n_samples=int(1e4), seed=0):
rng = np.random.RandomState(seed)
X, y = make_regression(n_samples=n_samples, n_features=4,
random_state=rng)
# Pre-bin the data to ensure a deterministic handling by the 2
# strategies and also make it easier to insert np.nan in a structured
# way:
X = KBinsDiscretizer(n_bins=42, encode="ordinal").fit_transform(X)
# First feature has missing values completely at random:
rnd_mask = rng.rand(X.shape[0]) > 0.9
X[rnd_mask, 0] = np.nan
# Second and third features have missing values for extreme values
# (censoring missingness):
low_mask = X[:, 1] == 0
X[low_mask, 1] = np.nan
high_mask = X[:, 2] == X[:, 2].max()
X[high_mask, 2] = np.nan
# Make the last feature nan pattern very informative:
y_max = np.percentile(y, 70)
y_max_mask = y >= y_max
y[y_max_mask] = y_max
X[y_max_mask, 3] = np.nan
# Check that there is at least one missing value in each feature:
for feature_idx in range(X.shape[1]):
assert any(np.isnan(X[:, feature_idx]))
# Let's use a test set to check that the learned decision function is
# the same as evaluated on unseen data. Otherwise it could just be the
# case that we find two independent ways to overfit the training set.
return train_test_split(X, y, random_state=rng)
# n_samples need to be large enough to minimize the likelihood of having
# several candidate splits with the same gain value in a given tree.
X_train, X_test, y_train, y_test = make_missing_value_data(
n_samples=int(1e4), seed=0)
# Use a small number of leaf nodes and iterations so as to keep
# under-fitting models to minimize the likelihood of ties when training the
# model.
gbm1 = HistGradientBoostingRegressor(max_iter=100,
max_leaf_nodes=5,
random_state=0)
gbm1.fit(X_train, y_train)
gbm2 = make_pipeline(MinMaxImputer(), clone(gbm1))
> gbm2.fit(X_train, y_train)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py:386:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/pipeline.py:352: in fit
self._final_estimator.fit(Xt, y, **fit_params)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:301: in fit
y_train, raw_predictions)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.loss.LeastSquares object at 0xffff9580a828>
gradients = array([ 4.6511135, 5.4659395, -74.38958 , ..., 28.82124 ,
3.4787138, 6.115484 ], dtype=float32)
hessians = array([[1.]], dtype=float32)
y_true = array([-123.57738749, 74.6151337 , 53.44932862, ..., -66.93498585,
74.6151337 , 74.6151337 ])
raw_predictions = array([-118.92627394, 80.081073 , -20.94025014, ..., -38.11374701,
78.09384736, 80.73061789])
def update_gradients_and_hessians(self, gradients, hessians, y_true,
raw_predictions):
# shape (1, n_samples) --> (n_samples,). reshape(-1) is more likely to
# return a view.
raw_predictions = raw_predictions.reshape(-1)
gradients = gradients.reshape(-1)
> _update_gradients_least_squares(gradients, y_true, raw_predictions)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/loss.py:153: Failed
__________ test_warm_start_clear[HistGradientBoostingRegressor-X1-y1] __________
[gw1] linux -- Python 3.7.3 $PREFIX/bin/python
GradientBoosting = <class 'sklearn.ensemble._hist_gradient_boosting.gradient_boosting.HistGradientBoostingRegressor'>
X = array([[-1.5415874 , 0.22739278, -1.35338886, ..., 0.86727663,
0.40520408, -0.06964158],
[-1.0212540...985, -0.67832984],
[ 1.83708069, -0.00296475, 1.80144921, ..., 0.11230769,
0.33109242, 1.51848293]])
y = array([ 5.89981499e+01, -1.51472301e+02, 3.92774675e+00, 1.30275835e+02,
2.04728060e+00, 1.01587138e+02, -1...4499e+02, 1.17344962e+02, 5.81688314e+01,
-7.85896400e+01, 1.59876618e+01, -2.50470392e+02, -2.19074129e+02])
@pytest.mark.parametrize('GradientBoosting, X, y', [
(HistGradientBoostingClassifier, X_classification, y_classification),
(HistGradientBoostingRegressor, X_regression, y_regression)
])
def test_warm_start_clear(GradientBoosting, X, y):
# Test if fit clears state.
gb_1 = GradientBoosting(n_iter_no_change=5, random_state=42)
gb_1.fit(X, y)
gb_2 = GradientBoosting(n_iter_no_change=5, random_state=42,
warm_start=True)
gb_2.fit(X, y) # inits state
gb_2.set_params(warm_start=False)
> gb_2.fit(X, y) # clears old state and equals est
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/tests/test_warm_start.py:143:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:362: in fit
X_binned_val, y_val,
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:443: in _check_early_stopping_scorer
self.scorer_(self, X_binned_val, y_val)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/metrics/_scorer.py:371: in _passthrough_scorer
return estimator.score(*args, **kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/base.py:422: in score
y_pred = self.predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:813: in predict
return self._raw_predict(X).ravel()
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffffad5fdeb8>
X = array([[49, 61, 6, 31, 62, 42, 3, 3, 80, 6, 27, 33, 89, 71, 20, 86,
59, 19, 86, 13, 79, 59, 56, 45, 47, 60... 57, 56,
36, 54, 49, 47, 75, 72, 46, 68, 14, 8, 74, 23, 44, 30, 74, 74,
87, 75, 9, 78]], dtype=uint8)
missing_values_bin_idx = 255
def predict_binned(self, X, missing_values_bin_idx):
"""Predict raw values for binned data.
Parameters
----------
X : ndarray, shape (n_samples, n_features)
The input samples.
missing_values_bin_idx : uint8
Index of the bin that is used for missing values. This is the
index of the last bin and is always equal to max_bins (as passed
to the GBDT classes), or equivalently to n_bins - 1.
Returns
-------
y : ndarray, shape (n_samples,)
The raw predicted values.
"""
out = np.empty(X.shape[0], dtype=Y_DTYPE)
> _predict_from_binned_data(self.nodes, X, missing_values_bin_idx, out)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:68: Failed
___________________ test_recursion_decision_function[0-est1] ___________________
[gw5] linux -- Python 3.7.3 $PREFIX/bin/python
est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
loss='auto', m...07,
validation_fraction=0.1, verbose=0,
warm_start=False)
target_feature = 0
@pytest.mark.parametrize('est', (
GradientBoostingClassifier(random_state=0),
HistGradientBoostingClassifier(random_state=0),
))
@pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
def test_recursion_decision_function(est, target_feature):
# Make sure the recursion method (implicitly uses decision_function) has
# the same result as using brute method with
# response_method=decision_function
X, y = make_classification(n_classes=2, n_clusters_per_class=1,
random_state=1)
assert np.mean(y) == .5 # make sure the init estimator predicts 0 anyway
est.fit(X, y)
preds_1, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
method='recursion')
preds_2, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
> method='brute')
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff94286c18>
X = array([[ 0.51731301, 0.26686086, 1.6169496 , ..., 0.19915097,
-1.23685338, 1.24328724],
[ 0.5173130...673, 1.17899425],
[ 0.51731301, 2.27449013, 1.37333246, ..., 0.71939066,
-2.17071106, -1.6845077 ]])
def predict(self, X):
"""Predict raw values for non-binned data.
Parameters
----------
X : ndarray, shape (n_samples, n_features)
The input samples.
Returns
-------
y : ndarray, shape (n_samples,)
The raw predicted values.
"""
out = np.empty(X.shape[0], dtype=Y_DTYPE)
> _predict_from_numeric_data(self.nodes, X, out)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
----------------------------- Captured stderr call -----------------------------
___________________ test_recursion_decision_function[1-est1] ___________________
[gw7] linux -- Python 3.7.3 $PREFIX/bin/python
est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
loss='auto', m...07,
validation_fraction=0.1, verbose=0,
warm_start=False)
target_feature = 1
@pytest.mark.parametrize('est', (
GradientBoostingClassifier(random_state=0),
HistGradientBoostingClassifier(random_state=0),
))
@pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
def test_recursion_decision_function(est, target_feature):
# Make sure the recursion method (implicitly uses decision_function) has
# the same result as using brute method with
# response_method=decision_function
X, y = make_classification(n_classes=2, n_clusters_per_class=1,
random_state=1)
assert np.mean(y) == .5 # make sure the init estimator predicts 0 anyway
est.fit(X, y)
preds_1, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
method='recursion')
preds_2, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
> method='brute')
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff88236f60>
X = array([[-0.58652394, 0.88056426, 1.6169496 , ..., 0.19915097,
-1.23685338, 1.24328724],
[ 0.1855356...673, 1.17899425],
[ 0.94926809, 0.88056426, 1.37333246, ..., 0.71939066,
-2.17071106, -1.6845077 ]])
def predict(self, X):
"""Predict raw values for non-binned data.
Parameters
----------
X : ndarray, shape (n_samples, n_features)
The input samples.
Returns
-------
y : ndarray, shape (n_samples,)
The raw predicted values.
"""
out = np.empty(X.shape[0], dtype=Y_DTYPE)
> _predict_from_numeric_data(self.nodes, X, out)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
___________________ test_recursion_decision_function[4-est1] ___________________
[gw5] linux -- Python 3.7.3 $PREFIX/bin/python
est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
loss='auto', m...07,
validation_fraction=0.1, verbose=0,
warm_start=False)
target_feature = 4
@pytest.mark.parametrize('est', (
GradientBoostingClassifier(random_state=0),
HistGradientBoostingClassifier(random_state=0),
))
@pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
def test_recursion_decision_function(est, target_feature):
# Make sure the recursion method (implicitly uses decision_function) has
# the same result as using brute method with
# response_method=decision_function
X, y = make_classification(n_classes=2, n_clusters_per_class=1,
random_state=1)
assert np.mean(y) == .5 # make sure the init estimator predicts 0 anyway
est.fit(X, y)
preds_1, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
method='recursion')
preds_2, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
> method='brute')
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff942860b8>
X = array([[-0.58652394, 0.26686086, 1.6169496 , ..., 0.19915097,
-1.23685338, 1.24328724],
[ 0.1855356...673, 1.17899425],
[ 0.94926809, 2.27449013, 1.37333246, ..., 0.71939066,
-2.17071106, -1.6845077 ]])
def predict(self, X):
"""Predict raw values for non-binned data.
Parameters
----------
X : ndarray, shape (n_samples, n_features)
The input samples.
Returns
-------
y : ndarray, shape (n_samples,)
The raw predicted values.
"""
out = np.empty(X.shape[0], dtype=Y_DTYPE)
> _predict_from_numeric_data(self.nodes, X, out)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
___________________ test_recursion_decision_function[3-est1] ___________________
[gw3] linux -- Python 3.7.3 $PREFIX/bin/python
est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
loss='auto', m...07,
validation_fraction=0.1, verbose=0,
warm_start=False)
target_feature = 3
@pytest.mark.parametrize('est', (
GradientBoostingClassifier(random_state=0),
HistGradientBoostingClassifier(random_state=0),
))
@pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
def test_recursion_decision_function(est, target_feature):
# Make sure the recursion method (implicitly uses decision_function) has
# the same result as using brute method with
# response_method=decision_function
X, y = make_classification(n_classes=2, n_clusters_per_class=1,
random_state=1)
assert np.mean(y) == .5 # make sure the init estimator predicts 0 anyway
est.fit(X, y)
preds_1, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
method='recursion')
preds_2, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
> method='brute')
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff96ebf9e8>
X = array([[-0.58652394, 0.26686086, 1.6169496 , ..., 0.19915097,
-1.23685338, 1.24328724],
[ 0.1855356...673, 1.17899425],
[ 0.94926809, 2.27449013, 1.37333246, ..., 0.71939066,
-2.17071106, -1.6845077 ]])
def predict(self, X):
"""Predict raw values for non-binned data.
Parameters
----------
X : ndarray, shape (n_samples, n_features)
The input samples.
Returns
-------
y : ndarray, shape (n_samples,)
The raw predicted values.
"""
out = np.empty(X.shape[0], dtype=Y_DTYPE)
> _predict_from_numeric_data(self.nodes, X, out)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
_______________ test_estimators[TSNE()-check_estimators_dtypes] ________________
[gw2] linux -- Python 3.7.3 $PREFIX/bin/python
estimator = TSNE(angle=0.5, early_exaggeration=12.0, init='random', learning_rate=200.0,
method='barnes_hut', metric='euclide...omponents=2, n_iter=1000, n_iter_without_progress=300, n_jobs=None,
perplexity=30.0, random_state=None, verbose=0)
check = functools.partial(<function check_estimators_dtypes at 0xffffa3d99950>, 'TSNE')
@parametrize_with_checks(_tested_estimators())
def test_estimators(estimator, check):
# Common tests for estimator instances
with ignore_warnings(category=(FutureWarning,
ConvergenceWarning,
UserWarning, FutureWarning)):
_set_checking_parameters(estimator)
> check(estimator)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/tests/test_common.py:98:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/utils/_testing.py:327: in wrapper
return fn(*args, **kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/utils/estimator_checks.py:1338: in check_estimators_dtypes
estimator.fit(X_train, y)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:904: in fit
self.fit_transform(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:886: in fit_transform
embedding = self._fit(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:798: in _fit
skip_num_points=skip_num_points)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:852: in _tsne
**opt_args)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:358: in _gradient_descent
error, grad = objective(p, *args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
params = array([-155.24246 , -50.656303 , 56.69138 , 169.73177 ,
-110.365 , -132.27484 , 77.20271 , -149.4...546 , -28.501383 , 15.549845 ,
-189.69225 , 40.249886 , 127.85508 , -69.97335 ],
dtype=float32)
P = <20x20 sparse matrix of type '<class 'numpy.float64'>'
with 380 stored elements in Compressed Sparse Row format>
degrees_of_freedom = 1, n_samples = 20, n_components = 2, angle = 0.5
skip_num_points = 0, verbose = 0, compute_error = False, num_threads = 96
def _kl_divergence_bh(params, P, degrees_of_freedom, n_samples, n_components,
angle=0.5, skip_num_points=0, verbose=False,
compute_error=True, num_threads=1):
"""t-SNE objective function: KL divergence of p_ijs and q_ijs.
Uses Barnes-Hut tree methods to calculate the gradient that
runs in O(NlogN) instead of O(N^2)
Parameters
----------
params : array, shape (n_params,)
Unraveled embedding.
P : csr sparse matrix, shape (n_samples, n_sample)
Sparse approximate joint probability matrix, computed only for the
k nearest-neighbors and symmetrized.
degrees_of_freedom : int
Degrees of freedom of the Student's-t distribution.
n_samples : int
Number of samples.
n_components : int
Dimension of the embedded space.
angle : float (default: 0.5)
This is the trade-off between speed and accuracy for Barnes-Hut T-SNE.
'angle' is the angular size (referred to as theta in [3]) of a distant
node as measured from a point. If this size is below 'angle' then it is
used as a summary node of all points contained within it.
This method is not very sensitive to changes in this parameter
in the range of 0.2 - 0.8. Angle less than 0.2 has quickly increasing
computation time and angle greater 0.8 has quickly increasing error.
skip_num_points : int (optional, default:0)
This does not compute the gradient for points with indices below
`skip_num_points`. This is useful when computing transforms of new
data where you'd like to keep the old data fixed.
verbose : int
Verbosity level.
compute_error: bool (optional, default:True)
If False, the kl_divergence is not computed and returns NaN.
num_threads : int (optional, default:1)
Number of threads used to compute the gradient. This is set here to
avoid calling _openmp_effective_n_threads for each gradient step.
Returns
-------
kl_divergence : float
Kullback-Leibler divergence of p_ij and q_ij.
grad : array, shape (n_params,)
Unraveled gradient of the Kullback-Leibler divergence with respect to
the embedding.
"""
params = params.astype(np.float32, copy=False)
X_embedded = params.reshape(n_samples, n_components)
val_P = P.data.astype(np.float32, copy=False)
neighbors = P.indices.astype(np.int64, copy=False)
indptr = P.indptr.astype(np.int64, copy=False)
grad = np.zeros(X_embedded.shape, dtype=np.float32)
error = _barnes_hut_tsne.gradient(val_P, X_embedded, neighbors, indptr,
grad, angle, n_components, verbose,
dof=degrees_of_freedom,
compute_error=compute_error,
> num_threads=num_threads)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:262: Failed
----------------------------- Captured stderr call -----------------------------
___________________ test_recursion_decision_function[5-est1] ___________________
[gw7] linux -- Python 3.7.3 $PREFIX/bin/python
est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
loss='auto', m...07,
validation_fraction=0.1, verbose=0,
warm_start=False)
target_feature = 5
@pytest.mark.parametrize('est', (
GradientBoostingClassifier(random_state=0),
HistGradientBoostingClassifier(random_state=0),
))
@pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
def test_recursion_decision_function(est, target_feature):
# Make sure the recursion method (implicitly uses decision_function) has
# the same result as using brute method with
# response_method=decision_function
X, y = make_classification(n_classes=2, n_clusters_per_class=1,
random_state=1)
assert np.mean(y) == .5 # make sure the init estimator predicts 0 anyway
est.fit(X, y)
preds_1, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
method='recursion')
preds_2, _ = partial_dependence(est, X, [target_feature],
response_method='decision_function',
> method='brute')
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff880f7160>
X = array([[-0.58652394, 0.26686086, 1.6169496 , ..., 0.19915097,
-1.23685338, 1.24328724],
[ 0.1855356...673, 1.17899425],
[ 0.94926809, 2.27449013, 1.37333246, ..., 0.71939066,
-2.17071106, -1.6845077 ]])
def predict(self, X):
"""Predict raw values for non-binned data.
Parameters
----------
X : ndarray, shape (n_samples, n_features)
The input samples.
Returns
-------
y : ndarray, shape (n_samples,)
The raw predicted values.
"""
out = np.empty(X.shape[0], dtype=Y_DTYPE)
> _predict_from_numeric_data(self.nodes, X, out)
E Failed: Timeout >300.0s
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
Issue Analytics
- State:
- Created 4 years ago
- Comments:10 (6 by maintainers)
Thanks, actually setting OMP_NUM_THREADS really helped things. Thanks for that pointer!
Thanks for the report and for trying to build aarch64 and ppc on conda-forge.
A fair amount of these use HistGradientBoostingClassifier. HistGradientBoostingClassifier uses OpenMP, and if you are using pytest-xdist you should definitely restrict the number of threads per worker in tests with OMP_NUM_THREADS (currently only OPENBLAS_NUM_THREADS is set in the recipe, I think). Otherwise HistGradientBoostingClassifier can be very slow in case of CPU oversubscription, particularly when used together with pytest-xdist (see https://github.com/scikit-learn/scikit-learn/issues/15078); we haven't really found the root cause of it.
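For illustration, here is a minimal sketch of one way to cap OpenMP threads per pytest-xdist worker. The helper names `threads_per_worker` and `limit_openmp_threads` are hypothetical and are not part of scikit-learn or the feedstock recipe. It assumes pytest-xdist exposes the worker count through the `PYTEST_XDIST_WORKER_COUNT` environment variable, and that the code runs before scikit-learn is imported (for example at the top of a `conftest.py`), since an OpenMP runtime typically reads `OMP_NUM_THREADS` only when it starts up.

```python
import os


def threads_per_worker(n_cpus, n_workers):
    """Split the available CPUs evenly across workers, never going below 1."""
    return max(1, n_cpus // max(1, n_workers))


def limit_openmp_threads(environ=os.environ, n_cpus=None):
    """Set OMP_NUM_THREADS unless the caller has already chosen a value.

    Assumption: pytest-xdist sets PYTEST_XDIST_WORKER_COUNT in each worker;
    when it is absent we treat the run as a single process.
    """
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    n_workers = int(environ.get("PYTEST_XDIST_WORKER_COUNT", "1"))
    n_threads = threads_per_worker(n_cpus, n_workers)
    # setdefault keeps any value exported explicitly by the build recipe.
    environ.setdefault("OMP_NUM_THREADS", str(n_threads))
    return environ["OMP_NUM_THREADS"]
```

On the 96-thread machine seen in the log above (`num_threads = 96`), eight xdist workers would each get 12 OpenMP threads under this scheme instead of 96. Exporting a fixed `OMP_NUM_THREADS` in the shell before invoking pytest is a simpler alternative with the same intent.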