Segfault in HistGradientBoostingClassifier
Describe the bug
I can trigger a segfault in HistGradientBoostingClassifier. ~~I could trigger it during cross-validation with both n_jobs=-1 and n_jobs=1.~~ Actually, I can no longer trigger it with n_jobs=1, although I could before (in a run without random_state set).
I am using both missing-value handling and categorical-feature handling at the same time. I don't know whether this could be one of the causes.
Steps/Code to Reproduce
# %%
import pandas as pd
target_name = "RainTomorrow"
data = pd.read_csv("./weather.csv", parse_dates=["Date"])
data = data.dropna(axis="index", subset=[target_name])
X, y = data.drop(columns=["Date", target_name]), data[target_name]
# %%
X.info()
# %%
from sklearn.preprocessing import OrdinalEncoder
from sklearn.compose import make_column_transformer, make_column_selector
categorical_columns = make_column_selector(dtype_include=object)(X)
preprocessing = make_column_transformer(
    (
        OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
        categorical_columns,
    ),
    remainder="passthrough",
)
# %%
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import HistGradientBoostingClassifier
model = make_pipeline(
    preprocessing,
    HistGradientBoostingClassifier(
        categorical_features=range(len(categorical_columns)),
        random_state=0,
    ),
)
# %%
from sklearn.model_selection import cross_validate
cross_validate(model, X, y, n_jobs=-1)
I am also attaching the dataset that I used to trigger the problem.
I tried to reproduce the problem with a random dataset containing both categorical features and missing values, but it did not segfault.
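For reference, a random dataset with both a categorical column and missing values could be built roughly as follows. This is only a sketch and not the script used in the report: the helper name `make_synthetic`, the column layout, and the missing rate are all assumptions made for illustration.

```python
import numpy as np


def make_synthetic(n_samples=1000, n_categories=5, missing_rate=0.1, seed=0):
    """Hypothetical helper: one categorical column (integer codes) and two
    numeric columns, with NaN injected at random positions."""
    rng = np.random.default_rng(seed)
    # Categorical codes in [0, n_categories), stored as floats so NaN fits.
    cat = rng.integers(0, n_categories, size=n_samples).astype(float)
    num = rng.normal(size=(n_samples, 2))
    X = np.column_stack([cat, num])
    # Replace roughly `missing_rate` of all entries with NaN.
    mask = rng.random(X.shape) < missing_rate
    X[mask] = np.nan
    y = rng.integers(0, 2, size=n_samples)
    return X, y


X_syn, y_syn = make_synthetic()
```

Such an array could then be passed to HistGradientBoostingClassifier with categorical_features=[0]. Note that every category code here is non-negative, which differs from the pipeline above, where unknown categories are encoded as -1.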
Expected Results
At least it should not segfault.
Actual Results
---------------------------------------------------------------------------
TerminatedWorkerError Traceback (most recent call last)
~/Documents/scratch/bug_hist_gradient_boosting.py in <module>
40 from sklearn.model_selection import cross_validate
41
----> 42 cross_validate(model, X, y, n_jobs=-1)
~/Documents/packages/scikit-learn/sklearn/model_selection/_validation.py in cross_validate(estimator, X, y, groups, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch, return_train_score, return_estimator, error_score)
265 # independent, and that it is pickle-able.
266 parallel = Parallel(n_jobs=n_jobs, verbose=verbose, pre_dispatch=pre_dispatch)
--> 267 results = parallel(
268 delayed(_fit_and_score)(
269 clone(estimator),
~/Documents/packages/joblib/joblib/parallel.py in __call__(self, iterable)
1052
1053 with self._backend.retrieval_context():
-> 1054 self.retrieve()
1055 # Make sure that we get a last message telling us we are done
1056 elapsed_time = time.time() - self._start_time
~/Documents/packages/joblib/joblib/parallel.py in retrieve(self)
931 try:
932 if getattr(self._backend, 'supports_timeout', False):
--> 933 self._output.extend(job.get(timeout=self.timeout))
934 else:
935 self._output.extend(job.get())
~/Documents/packages/joblib/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
540 AsyncResults.get from multiprocessing."""
541 try:
--> 542 return future.result(timeout=timeout)
543 except CfTimeoutError as e:
544 raise TimeoutError from e
~/mambaforge/envs/dev/lib/python3.8/concurrent/futures/_base.py in result(self, timeout)
442 raise CancelledError()
443 elif self._state == FINISHED:
--> 444 return self.__get_result()
445 else:
446 raise TimeoutError()
~/mambaforge/envs/dev/lib/python3.8/concurrent/futures/_base.py in __get_result(self)
387 if self._exception:
388 try:
--> 389 raise self._exception
390 finally:
391 # Break a reference cycle with the exception in self._exception
TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
The exit codes of the workers are {SIGSEGV(-11)}
Versions
System:
python: 3.8.12 | packaged by conda-forge | (default, Sep 16 2021, 01:38:21) [Clang 11.1.0 ]
executable: /Users/glemaitre/mambaforge/envs/dev/bin/python
machine: macOS-11.6-arm64-arm-64bit
Python dependencies:
pip: 21.2.4
setuptools: 58.2.0
sklearn: 1.1.dev0
numpy: 1.21.2
scipy: 1.7.1
Cython: 0.29.24
pandas: 1.3.3
matplotlib: 3.4.3
joblib: 1.0.1
threadpoolctl: 3.0.0
Built with OpenMP: True
Issue Analytics
- State:
- Created 2 years ago
- Comments:13 (12 by maintainers)
Top GitHub Comments
Tomorrow is Friday. It could be a nice day to release 😃
On Thu, 14 Oct 2021 at 19:49, Olivier Grisel @.***> wrote:
– Guillaume Lemaitre Scikit-learn @ Inria Foundation https://glemaitre.github.io/
I think we can consider that #21227 will fix it in 1.0.1.
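For readers who want to check their own pipeline, one detail of the reproducer may be worth a look: the OrdinalEncoder maps categories unseen during fit to -1, so test folds in cross-validation can hand negative category codes to the classifier. Whether this is what triggered the segfault here is an assumption, not something confirmed in this thread. A minimal sketch with made-up category values (the arrays `train` and `test` are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data: "W" appears only in the test split.
train = np.array([["N"], ["S"], ["E"]], dtype=object)
test = np.array([["S"], ["W"]], dtype=object)

# Same encoder configuration as in the reproducer.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
encoder.fit(train)
encoded_test = encoder.transform(test)

# True when at least one encoded category is negative.
has_negative = bool((encoded_test < 0).any())
print(encoded_test.ravel(), has_negative)
```

If `has_negative` is true for real data, inspecting how the downstream estimator treats negative category codes in the installed scikit-learn version would be a reasonable next step.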