
Really slow tests on aarch64

See original GitHub issue

Attached: logs_conda-forge_scikit-learn-feedstock_2_2.log

Man, I can understand if these end up as “won’t fix”, but when building on Drone, some tests time out after 5 minutes, but only sometimes.

The affected tests are the following:

  • test_least_absolute_deviation
  • test_missing_values_minmax_imputation
  • test_warm_start_clear
  • test_recursion_decision_function
  • test_estimators
  • test_early_stopping_regression
  • test_missing_values_resilience
  • test_warm_start_yields_identical_results
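
As a side note, one possible way to poke at these locally (a sketch only, not something from the feedstock recipe) is to re-run just the affected test modules with a longer per-test timeout; the 300 s limit in the logs below comes from pytest-timeout, so raising it helps tell a merely slow emulated builder apart from a real hang:

    # Sketch: requires the pytest-timeout plugin; the module paths are taken
    # from the tracebacks below.
    import pytest

    pytest.main([
        "--timeout=900",
        "-v",
        "--pyargs",
        "sklearn.ensemble._hist_gradient_boosting.tests.test_gradient_boosting",
        "sklearn.ensemble._hist_gradient_boosting.tests.test_warm_start",
        "sklearn.inspection.tests.test_partial_dependence",
    ])
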
Logs
________________________ test_least_absolute_deviation _________________________
[gw7] linux -- Python 3.7.3 $PREFIX/bin/python

    def test_least_absolute_deviation():
        # For coverage only.
        X, y = make_regression(n_samples=500, random_state=0)
        gbdt = HistGradientBoostingRegressor(loss='least_absolute_deviation',
                                             random_state=0)
>       gbdt.fit(X, y)

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py:163: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:319: in fit
    grower.grow()
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/grower.py:252: in grow
    self.split_next()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.grower.TreeGrower object at 0xffff8ace4470>

    def split_next(self):
        """Split the node with highest potential gain.
    
        Returns
        -------
        left : TreeNode
            The resulting left child.
        right : TreeNode
            The resulting right child.
        """
        # Consider the node with the highest loss reduction (a.k.a. gain)
        node = heappop(self.splittable_nodes)
    
        tic = time()
        (sample_indices_left,
         sample_indices_right,
         right_child_pos) = self.splitter.split_indices(node.split_info,
>                                                       node.sample_indices)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/grower.py:320: Failed
____________________ test_missing_values_minmax_imputation _____________________
[gw5] linux -- Python 3.7.3 $PREFIX/bin/python

    def test_missing_values_minmax_imputation():
        # Compare the buit-in missing value handling of Histogram GBC with an
        # a-priori missing value imputation strategy that should yield the same
        # results in terms of decision function.
        #
        # Each feature (containing NaNs) is replaced by 2 features:
        # - one where the nans are replaced by min(feature) - 1
        # - one where the nans are replaced by max(feature) + 1
        # A split where nans go to the left has an equivalent split in the
        # first (min) feature, and a split where nans go to the right has an
        # equivalent split in the second (max) feature.
        #
        # Assuming the data is such that there is never a tie to select the best
        # feature to split on during training, the learned decision trees should be
        # strictly equivalent (learn a sequence of splits that encode the same
        # decision function).
        #
        # The MinMaxImputer transformer is meant to be a toy implementation of the
        # "Missing In Attributes" (MIA) missing value handling for decision trees
        # https://www.sciencedirect.com/science/article/abs/pii/S0167865508000305
        # The implementation of MIA as an imputation transformer was suggested by
        # "Remark 3" in https://arxiv.org/abs/1902.06931
    
        class MinMaxImputer(BaseEstimator, TransformerMixin):
    
            def fit(self, X, y=None):
                mm = MinMaxScaler().fit(X)
                self.data_min_ = mm.data_min_
                self.data_max_ = mm.data_max_
                return self
    
            def transform(self, X):
                X_min, X_max = X.copy(), X.copy()
    
                for feature_idx in range(X.shape[1]):
                    nan_mask = np.isnan(X[:, feature_idx])
                    X_min[nan_mask, feature_idx] = self.data_min_[feature_idx] - 1
                    X_max[nan_mask, feature_idx] = self.data_max_[feature_idx] + 1
    
                return np.concatenate([X_min, X_max], axis=1)
    
        def make_missing_value_data(n_samples=int(1e4), seed=0):
            rng = np.random.RandomState(seed)
            X, y = make_regression(n_samples=n_samples, n_features=4,
                                   random_state=rng)
    
            # Pre-bin the data to ensure a deterministic handling by the 2
            # strategies and also make it easier to insert np.nan in a structured
            # way:
            X = KBinsDiscretizer(n_bins=42, encode="ordinal").fit_transform(X)
    
            # First feature has missing values completely at random:
            rnd_mask = rng.rand(X.shape[0]) > 0.9
            X[rnd_mask, 0] = np.nan
    
            # Second and third features have missing values for extreme values
            # (censoring missingness):
            low_mask = X[:, 1] == 0
            X[low_mask, 1] = np.nan
    
            high_mask = X[:, 2] == X[:, 2].max()
            X[high_mask, 2] = np.nan
    
            # Make the last feature nan pattern very informative:
            y_max = np.percentile(y, 70)
            y_max_mask = y >= y_max
            y[y_max_mask] = y_max
            X[y_max_mask, 3] = np.nan
    
            # Check that there is at least one missing value in each feature:
            for feature_idx in range(X.shape[1]):
                assert any(np.isnan(X[:, feature_idx]))
    
            # Let's use a test set to check that the learned decision function is
            # the same as evaluated on unseen data. Otherwise it could just be the
            # case that we find two independent ways to overfit the training set.
            return train_test_split(X, y, random_state=rng)
    
        # n_samples need to be large enough to minimize the likelihood of having
        # several candidate splits with the same gain value in a given tree.
        X_train, X_test, y_train, y_test = make_missing_value_data(
            n_samples=int(1e4), seed=0)
    
        # Use a small number of leaf nodes and iterations so as to keep
        # under-fitting models to minimize the likelihood of ties when training the
        # model.
        gbm1 = HistGradientBoostingRegressor(max_iter=100,
                                             max_leaf_nodes=5,
                                             random_state=0)
        gbm1.fit(X_train, y_train)
    
        gbm2 = make_pipeline(MinMaxImputer(), clone(gbm1))
>       gbm2.fit(X_train, y_train)

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py:386: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/pipeline.py:352: in fit
    self._final_estimator.fit(Xt, y, **fit_params)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:301: in fit
    y_train, raw_predictions)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.loss.LeastSquares object at 0xffff9580a828>
gradients = array([  4.6511135,   5.4659395, -74.38958  , ...,  28.82124  ,
         3.4787138,   6.115484 ], dtype=float32)
hessians = array([[1.]], dtype=float32)
y_true = array([-123.57738749,   74.6151337 ,   53.44932862, ...,  -66.93498585,
         74.6151337 ,   74.6151337 ])
raw_predictions = array([-118.92627394,   80.081073  ,  -20.94025014, ...,  -38.11374701,
         78.09384736,   80.73061789])

    def update_gradients_and_hessians(self, gradients, hessians, y_true,
                                      raw_predictions):
        # shape (1, n_samples) --> (n_samples,). reshape(-1) is more likely to
        # return a view.
        raw_predictions = raw_predictions.reshape(-1)
        gradients = gradients.reshape(-1)
>       _update_gradients_least_squares(gradients, y_true, raw_predictions)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/loss.py:153: Failed
__________ test_warm_start_clear[HistGradientBoostingRegressor-X1-y1] __________
[gw1] linux -- Python 3.7.3 $PREFIX/bin/python

GradientBoosting = <class 'sklearn.ensemble._hist_gradient_boosting.gradient_boosting.HistGradientBoostingRegressor'>
X = array([[-1.5415874 ,  0.22739278, -1.35338886, ...,  0.86727663,
         0.40520408, -0.06964158],
       [-1.0212540...985, -0.67832984],
       [ 1.83708069, -0.00296475,  1.80144921, ...,  0.11230769,
         0.33109242,  1.51848293]])
y = array([ 5.89981499e+01, -1.51472301e+02,  3.92774675e+00,  1.30275835e+02,
        2.04728060e+00,  1.01587138e+02, -1...4499e+02,  1.17344962e+02,  5.81688314e+01,
       -7.85896400e+01,  1.59876618e+01, -2.50470392e+02, -2.19074129e+02])

    @pytest.mark.parametrize('GradientBoosting, X, y', [
        (HistGradientBoostingClassifier, X_classification, y_classification),
        (HistGradientBoostingRegressor, X_regression, y_regression)
    ])
    def test_warm_start_clear(GradientBoosting, X, y):
        # Test if fit clears state.
        gb_1 = GradientBoosting(n_iter_no_change=5, random_state=42)
        gb_1.fit(X, y)
    
        gb_2 = GradientBoosting(n_iter_no_change=5, random_state=42,
                                warm_start=True)
        gb_2.fit(X, y)  # inits state
        gb_2.set_params(warm_start=False)
>       gb_2.fit(X, y)  # clears old state and equals est

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/tests/test_warm_start.py:143: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:362: in fit
    X_binned_val, y_val,
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:443: in _check_early_stopping_scorer
    self.scorer_(self, X_binned_val, y_val)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/metrics/_scorer.py:371: in _passthrough_scorer
    return estimator.score(*args, **kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/base.py:422: in score
    y_pred = self.predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:813: in predict
    return self._raw_predict(X).ravel()
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
    raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffffad5fdeb8>
X = array([[49, 61,  6, 31, 62, 42,  3,  3, 80,  6, 27, 33, 89, 71, 20, 86,
        59, 19, 86, 13, 79, 59, 56, 45, 47, 60... 57, 56,
        36, 54, 49, 47, 75, 72, 46, 68, 14,  8, 74, 23, 44, 30, 74, 74,
        87, 75,  9, 78]], dtype=uint8)
missing_values_bin_idx = 255

    def predict_binned(self, X, missing_values_bin_idx):
        """Predict raw values for binned data.
    
        Parameters
        ----------
        X : ndarray, shape (n_samples, n_features)
            The input samples.
        missing_values_bin_idx : uint8
            Index of the bin that is used for missing values. This is the
            index of the last bin and is always equal to max_bins (as passed
            to the GBDT classes), or equivalently to n_bins - 1.
    
        Returns
        -------
        y : ndarray, shape (n_samples,)
            The raw predicted values.
        """
        out = np.empty(X.shape[0], dtype=Y_DTYPE)
>       _predict_from_binned_data(self.nodes, X, missing_values_bin_idx, out)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:68: Failed
___________________ test_recursion_decision_function[0-est1] ___________________
[gw5] linux -- Python 3.7.3 $PREFIX/bin/python

est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
                               loss='auto', m...07,
                               validation_fraction=0.1, verbose=0,
                               warm_start=False)
target_feature = 0

    @pytest.mark.parametrize('est', (
        GradientBoostingClassifier(random_state=0),
        HistGradientBoostingClassifier(random_state=0),
    ))
    @pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
    def test_recursion_decision_function(est, target_feature):
        # Make sure the recursion method (implicitly uses decision_function) has
        # the same result as using brute method with
        # response_method=decision_function
    
        X, y = make_classification(n_classes=2, n_clusters_per_class=1,
                                   random_state=1)
        assert np.mean(y) == .5  # make sure the init estimator predicts 0 anyway
    
        est.fit(X, y)
    
        preds_1, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
                                        method='recursion')
        preds_2, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
>                                       method='brute')

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
    estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
    predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
    decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
    raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff94286c18>
X = array([[ 0.51731301,  0.26686086,  1.6169496 , ...,  0.19915097,
        -1.23685338,  1.24328724],
       [ 0.5173130...673,  1.17899425],
       [ 0.51731301,  2.27449013,  1.37333246, ...,  0.71939066,
        -2.17071106, -1.6845077 ]])

    def predict(self, X):
        """Predict raw values for non-binned data.
    
        Parameters
        ----------
        X : ndarray, shape (n_samples, n_features)
            The input samples.
    
        Returns
        -------
        y : ndarray, shape (n_samples,)
            The raw predicted values.
        """
        out = np.empty(X.shape[0], dtype=Y_DTYPE)
>       _predict_from_numeric_data(self.nodes, X, out)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
----------------------------- Captured stderr call -----------------------------
___________________ test_recursion_decision_function[1-est1] ___________________
[gw7] linux -- Python 3.7.3 $PREFIX/bin/python

est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
                               loss='auto', m...07,
                               validation_fraction=0.1, verbose=0,
                               warm_start=False)
target_feature = 1

    @pytest.mark.parametrize('est', (
        GradientBoostingClassifier(random_state=0),
        HistGradientBoostingClassifier(random_state=0),
    ))
    @pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
    def test_recursion_decision_function(est, target_feature):
        # Make sure the recursion method (implicitly uses decision_function) has
        # the same result as using brute method with
        # response_method=decision_function
    
        X, y = make_classification(n_classes=2, n_clusters_per_class=1,
                                   random_state=1)
        assert np.mean(y) == .5  # make sure the init estimator predicts 0 anyway
    
        est.fit(X, y)
    
        preds_1, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
                                        method='recursion')
        preds_2, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
>                                       method='brute')

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
    estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
    predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
    decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
    raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff88236f60>
X = array([[-0.58652394,  0.88056426,  1.6169496 , ...,  0.19915097,
        -1.23685338,  1.24328724],
       [ 0.1855356...673,  1.17899425],
       [ 0.94926809,  0.88056426,  1.37333246, ...,  0.71939066,
        -2.17071106, -1.6845077 ]])

    def predict(self, X):
        """Predict raw values for non-binned data.
    
        Parameters
        ----------
        X : ndarray, shape (n_samples, n_features)
            The input samples.
    
        Returns
        -------
        y : ndarray, shape (n_samples,)
            The raw predicted values.
        """
        out = np.empty(X.shape[0], dtype=Y_DTYPE)
>       _predict_from_numeric_data(self.nodes, X, out)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
___________________ test_recursion_decision_function[4-est1] ___________________
[gw5] linux -- Python 3.7.3 $PREFIX/bin/python

est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
                               loss='auto', m...07,
                               validation_fraction=0.1, verbose=0,
                               warm_start=False)
target_feature = 4

    @pytest.mark.parametrize('est', (
        GradientBoostingClassifier(random_state=0),
        HistGradientBoostingClassifier(random_state=0),
    ))
    @pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
    def test_recursion_decision_function(est, target_feature):
        # Make sure the recursion method (implicitly uses decision_function) has
        # the same result as using brute method with
        # response_method=decision_function
    
        X, y = make_classification(n_classes=2, n_clusters_per_class=1,
                                   random_state=1)
        assert np.mean(y) == .5  # make sure the init estimator predicts 0 anyway
    
        est.fit(X, y)
    
        preds_1, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
                                        method='recursion')
        preds_2, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
>                                       method='brute')

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
    estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
    predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
    decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
    raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff942860b8>
X = array([[-0.58652394,  0.26686086,  1.6169496 , ...,  0.19915097,
        -1.23685338,  1.24328724],
       [ 0.1855356...673,  1.17899425],
       [ 0.94926809,  2.27449013,  1.37333246, ...,  0.71939066,
        -2.17071106, -1.6845077 ]])

    def predict(self, X):
        """Predict raw values for non-binned data.
    
        Parameters
        ----------
        X : ndarray, shape (n_samples, n_features)
            The input samples.
    
        Returns
        -------
        y : ndarray, shape (n_samples,)
            The raw predicted values.
        """
        out = np.empty(X.shape[0], dtype=Y_DTYPE)
>       _predict_from_numeric_data(self.nodes, X, out)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
___________________ test_recursion_decision_function[3-est1] ___________________
[gw3] linux -- Python 3.7.3 $PREFIX/bin/python

est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
                               loss='auto', m...07,
                               validation_fraction=0.1, verbose=0,
                               warm_start=False)
target_feature = 3

    @pytest.mark.parametrize('est', (
        GradientBoostingClassifier(random_state=0),
        HistGradientBoostingClassifier(random_state=0),
    ))
    @pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
    def test_recursion_decision_function(est, target_feature):
        # Make sure the recursion method (implicitly uses decision_function) has
        # the same result as using brute method with
        # response_method=decision_function
    
        X, y = make_classification(n_classes=2, n_clusters_per_class=1,
                                   random_state=1)
        assert np.mean(y) == .5  # make sure the init estimator predicts 0 anyway
    
        est.fit(X, y)
    
        preds_1, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
                                        method='recursion')
        preds_2, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
>                                       method='brute')

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
    estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
    predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
    decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
    raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff96ebf9e8>
X = array([[-0.58652394,  0.26686086,  1.6169496 , ...,  0.19915097,
        -1.23685338,  1.24328724],
       [ 0.1855356...673,  1.17899425],
       [ 0.94926809,  2.27449013,  1.37333246, ...,  0.71939066,
        -2.17071106, -1.6845077 ]])

    def predict(self, X):
        """Predict raw values for non-binned data.
    
        Parameters
        ----------
        X : ndarray, shape (n_samples, n_features)
            The input samples.
    
        Returns
        -------
        y : ndarray, shape (n_samples,)
            The raw predicted values.
        """
        out = np.empty(X.shape[0], dtype=Y_DTYPE)
>       _predict_from_numeric_data(self.nodes, X, out)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed
_______________ test_estimators[TSNE()-check_estimators_dtypes] ________________
[gw2] linux -- Python 3.7.3 $PREFIX/bin/python

estimator = TSNE(angle=0.5, early_exaggeration=12.0, init='random', learning_rate=200.0,
     method='barnes_hut', metric='euclide...omponents=2, n_iter=1000, n_iter_without_progress=300, n_jobs=None,
     perplexity=30.0, random_state=None, verbose=0)
check = functools.partial(<function check_estimators_dtypes at 0xffffa3d99950>, 'TSNE')

    @parametrize_with_checks(_tested_estimators())
    def test_estimators(estimator, check):
        # Common tests for estimator instances
        with ignore_warnings(category=(FutureWarning,
                                       ConvergenceWarning,
                                       UserWarning, FutureWarning)):
            _set_checking_parameters(estimator)
>           check(estimator)

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/tests/test_common.py:98: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/utils/_testing.py:327: in wrapper
    return fn(*args, **kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/utils/estimator_checks.py:1338: in check_estimators_dtypes
    estimator.fit(X_train, y)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:904: in fit
    self.fit_transform(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:886: in fit_transform
    embedding = self._fit(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:798: in _fit
    skip_num_points=skip_num_points)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:852: in _tsne
    **opt_args)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:358: in _gradient_descent
    error, grad = objective(p, *args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

params = array([-155.24246  ,  -50.656303 ,   56.69138  ,  169.73177  ,
       -110.365    , -132.27484  ,   77.20271  , -149.4...546  ,  -28.501383 ,   15.549845 ,
       -189.69225  ,   40.249886 ,  127.85508  ,  -69.97335  ],
      dtype=float32)
P = <20x20 sparse matrix of type '<class 'numpy.float64'>'
	with 380 stored elements in Compressed Sparse Row format>
degrees_of_freedom = 1, n_samples = 20, n_components = 2, angle = 0.5
skip_num_points = 0, verbose = 0, compute_error = False, num_threads = 96

    def _kl_divergence_bh(params, P, degrees_of_freedom, n_samples, n_components,
                          angle=0.5, skip_num_points=0, verbose=False,
                          compute_error=True, num_threads=1):
        """t-SNE objective function: KL divergence of p_ijs and q_ijs.
    
        Uses Barnes-Hut tree methods to calculate the gradient that
        runs in O(NlogN) instead of O(N^2)
    
        Parameters
        ----------
        params : array, shape (n_params,)
            Unraveled embedding.
    
        P : csr sparse matrix, shape (n_samples, n_sample)
            Sparse approximate joint probability matrix, computed only for the
            k nearest-neighbors and symmetrized.
    
        degrees_of_freedom : int
            Degrees of freedom of the Student's-t distribution.
    
        n_samples : int
            Number of samples.
    
        n_components : int
            Dimension of the embedded space.
    
        angle : float (default: 0.5)
            This is the trade-off between speed and accuracy for Barnes-Hut T-SNE.
            'angle' is the angular size (referred to as theta in [3]) of a distant
            node as measured from a point. If this size is below 'angle' then it is
            used as a summary node of all points contained within it.
            This method is not very sensitive to changes in this parameter
            in the range of 0.2 - 0.8. Angle less than 0.2 has quickly increasing
            computation time and angle greater 0.8 has quickly increasing error.
    
        skip_num_points : int (optional, default:0)
            This does not compute the gradient for points with indices below
            `skip_num_points`. This is useful when computing transforms of new
            data where you'd like to keep the old data fixed.
    
        verbose : int
            Verbosity level.
    
        compute_error: bool (optional, default:True)
            If False, the kl_divergence is not computed and returns NaN.
    
        num_threads : int (optional, default:1)
            Number of threads used to compute the gradient. This is set here to
            avoid calling _openmp_effective_n_threads for each gradient step.
    
        Returns
        -------
        kl_divergence : float
            Kullback-Leibler divergence of p_ij and q_ij.
    
        grad : array, shape (n_params,)
            Unraveled gradient of the Kullback-Leibler divergence with respect to
            the embedding.
        """
        params = params.astype(np.float32, copy=False)
        X_embedded = params.reshape(n_samples, n_components)
    
        val_P = P.data.astype(np.float32, copy=False)
        neighbors = P.indices.astype(np.int64, copy=False)
        indptr = P.indptr.astype(np.int64, copy=False)
    
        grad = np.zeros(X_embedded.shape, dtype=np.float32)
        error = _barnes_hut_tsne.gradient(val_P, X_embedded, neighbors, indptr,
                                          grad, angle, n_components, verbose,
                                          dof=degrees_of_freedom,
                                          compute_error=compute_error,
>                                         num_threads=num_threads)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/manifold/_t_sne.py:262: Failed
----------------------------- Captured stderr call -----------------------------
___________________ test_recursion_decision_function[5-est1] ___________________
[gw7] linux -- Python 3.7.3 $PREFIX/bin/python

est = HistGradientBoostingClassifier(l2_regularization=0.0, learning_rate=0.1,
                               loss='auto', m...07,
                               validation_fraction=0.1, verbose=0,
                               warm_start=False)
target_feature = 5

    @pytest.mark.parametrize('est', (
        GradientBoostingClassifier(random_state=0),
        HistGradientBoostingClassifier(random_state=0),
    ))
    @pytest.mark.parametrize('target_feature', (0, 1, 2, 3, 4, 5))
    def test_recursion_decision_function(est, target_feature):
        # Make sure the recursion method (implicitly uses decision_function) has
        # the same result as using brute method with
        # response_method=decision_function
    
        X, y = make_classification(n_classes=2, n_clusters_per_class=1,
                                   random_state=1)
        assert np.mean(y) == .5  # make sure the init estimator predicts 0 anyway
    
        est.fit(X, y)
    
        preds_1, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
                                        method='recursion')
        preds_2, _ = partial_dependence(est, X, [target_feature],
                                        response_method='decision_function',
>                                       method='brute')

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/tests/test_partial_dependence.py:230: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:401: in partial_dependence
    estimator, grid, features_indices, X, response_method
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/inspection/_partial_dependence.py:147: in _partial_dependence_brute
    predictions = prediction_method(X_eval)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:1030: in decision_function
    decision = self._raw_predict(X)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py:594: in _raw_predict
    raw_predictions[k, :] += predict(X)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.ensemble._hist_gradient_boosting.predictor.TreePredictor object at 0xffff880f7160>
X = array([[-0.58652394,  0.26686086,  1.6169496 , ...,  0.19915097,
        -1.23685338,  1.24328724],
       [ 0.1855356...673,  1.17899425],
       [ 0.94926809,  2.27449013,  1.37333246, ...,  0.71939066,
        -2.17071106, -1.6845077 ]])

    def predict(self, X):
        """Predict raw values for non-binned data.
    
        Parameters
        ----------
        X : ndarray, shape (n_samples, n_features)
            The input samples.
    
        Returns
        -------
        y : ndarray, shape (n_samples,)
            The raw predicted values.
        """
        out = np.empty(X.shape[0], dtype=Y_DTYPE)
>       _predict_from_numeric_data(self.nodes, X, out)
E       Failed: Timeout >300.0s

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py:47: Failed

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

1 reaction
hmaarrfk commented, Dec 9, 2019

Thanks, setting OMP_NUM_THREADS really did help things. Thanks for that pointer!
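
As a quick sanity check, one illustrative way to confirm the variable actually takes effect is shown below; it relies on a private scikit-learn helper that the tracebacks above happen to mention, so treat it as a sketch only:

    # Sketch: _openmp_effective_n_threads is private scikit-learn API.
    import os
    os.environ["OMP_NUM_THREADS"] = "2"  # set before any OpenMP code runs

    from sklearn.utils._openmp_helpers import _openmp_effective_n_threads
    # Should report 2 rather than the machine's full core count
    # (num_threads = 96 in the logs above).
    print(_openmp_effective_n_threads())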

1 reaction
rth commented, Dec 9, 2019

Thanks for the report, and for working on the aarch64 and ppc builds on conda-forge.

A fair number of these tests use HistGradientBoostingClassifier.

HistGradientBoostingClassifier uses OpenMP, so if you are running the tests with pytest-xdist you should definitely restrict the number of threads per worker with OMP_NUM_THREADS (currently only OPENBLAS_NUM_THREADS is set in the recipe, I think). Otherwise HistGradientBoostingClassifier can be very slow due to CPU oversubscription, particularly when combined with pytest-xdist; see https://github.com/scikit-learn/scikit-learn/issues/15078, we haven’t really found the root cause of it.
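
For illustration, a minimal sketch of that suggestion (the value of 2 threads and the conftest.py placement are assumptions, not what the recipe currently does) is to export the thread-count variables before the xdist workers import scikit-learn:

    # conftest.py (sketch): cap OpenMP/BLAS threads for every pytest-xdist
    # worker so that N workers x M OpenMP threads doesn't oversubscribe the CPU.
    import os

    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ.setdefault(var, "2")  # "2" is an arbitrary small value

Exporting the same variables in the recipe’s test script before invoking pytest should have the same effect.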

Read more comments on GitHub >

Top Results From Across the Web

  • Code compiled for arm64 much slower than for x86_64: I am using Clang on macOS / arm64. I encountered a situation where the same code runs three times slower when compiled for...
  • Re: [PATCH] tests/acceptance/boot_linux: Skip slow Aarch64 'virt ...: The BootLinuxAarch64.test_virt_tcg is reported to take >7min to run. Add a possibility to users to skip this particular test, by setting the ...
  • Re: [PATCH] tests/acceptance/boot_linux: Skip slow Aarch64 'virt ...: Subject: Re: [PATCH] tests/acceptance/boot_linux: Skip slow Aarch64 'virt' machine TCG test. Date: Thu, 07 May 2020 21:32:41 +0100. User-agent: mu4e 1.4.4; ...
  • Testing - DynamoRIO: The Actions integrated output viewer is slow; it is often better to use the right-hand ... We use Jenkins for pre- and...
  • How fast (or slow) is a LDR cache HIT compared to other ARM ...: I suppose the answer also depends on the precise hardware being used. And that one should simply write test cases and measure the...
