Running RandomizedSearchCV results in inability to perform further tasks in multiprocessing
Describe the bug
After RandomizedSearchCV has run, any function subsequently executed in multiprocessing mode (concurrent.futures.ProcessPoolExecutor) freezes.
Running the same function in multiprocessing mode before RandomizedSearchCV works fine.
Reducing n_jobs in RandomizedSearchCV to 1 does not help: subsequent multiprocessing tasks still freeze.
I have also tried changing the default backend from
joblib.parallel_backend('loky')
to
joblib.parallel_backend('multiprocessing')
joblib.parallel_backend('threading')
…didn’t help.
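For reference, joblib.parallel_backend is normally used as a context manager, so the selected backend applies only to the Parallel calls made inside the block. A minimal sketch, assuming joblib is installed; the names square and run_with_backend are hypothetical helpers, not part of the original report:

```python
from joblib import Parallel, delayed, parallel_backend


def square(n):
    # Trivial stand-in for real work (hypothetical helper).
    return n * n


def run_with_backend(backend, count):
    # The backend and n_jobs chosen here apply only inside the `with` block.
    with parallel_backend(backend, n_jobs=2):
        return Parallel()(delayed(square)(i) for i in range(count))


if __name__ == "__main__":
    print(run_with_backend("threading", 4))
```

As noted above, switching backends did not resolve the freeze in this report; the sketch only shows the usual form of the call.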
Interesting note: the problem does not reproduce if the subsequent function is very simple (for example, it just prints a string). But as soon as any complexity is added to the function, it fails to run after RandomizedSearchCV. It looks as if RandomizedSearchCV either does not release its workers or triggers some other processes.
Here is a small video to illustrate the problem:
https://vimeo.com/user50681456/review/474733642/b712c12c2c
Steps/Code to Reproduce
from xgboost import XGBRegressor
from sklearn.model_selection import KFold
import concurrent.futures
from sklearn.datasets import make_regression
import pandas as pd
import numpy as np
from sklearn.model_selection import RandomizedSearchCV

# DEFINE FUNCTIONS
def simple_func():
    from sklearn.datasets import make_regression
    # JUST CREATING A DATASET, NOT EVEN FITTING ANY MODEL!!! AND IT FREEZES
    data = make_regression(n_samples=500, n_features=100, n_informative=10, n_targets=1, random_state=5)
    print('Fit complete')

def just_print():
    print('Just printing')

def run_randomized_search_cv():
    data = make_regression(n_samples=500, n_features=100, n_informative=10, n_targets=1, random_state=5)
    X = pd.DataFrame(data[0])
    y = pd.Series(data[1])
    kf = KFold(n_splits = 3, shuffle = True, random_state = 5)
    model = XGBRegressor()
    params = {
        'min_child_weight': [0.1, 1, 5],
        'subsample': [0.5, 0.7, 1.0],
        'colsample_bytree': [0.5, 0.7, 1.0],
        'eta': [0.005, 0.01, 0.1]
    }
    random_search = RandomizedSearchCV(
        model,
        param_distributions = params,
        n_iter = 25,
        n_jobs = -1,
        refit = True, # necessary for random_search.best_estimator_
        cv = kf.split(X,y),
        verbose = 1,
        random_state = 5
    )
    random_search.fit(X, np.array(y))

# STEP 0
# test multiprocessing with concurrent.futures and a simple function
with concurrent.futures.ProcessPoolExecutor() as executor:
    results_temp = [executor.submit(simple_func) for i in range(0,12)]
# ----------------------------------------------------------------------------
# STEP 1
# simulate RandomizedSearchCV
run_randomized_search_cv()
# ----------------------------------------------------------------------------
# STEP 2.0
# test if multiprocessing on a function that just prints
with concurrent.futures.ProcessPoolExecutor() as executor:
    results_temp = [executor.submit(just_print) for i in range(0,12)]
# ----------------------------------------------------------------------------
# STEP 3
# test the function from STEP 0
with concurrent.futures.ProcessPoolExecutor() as executor:
    results_temp = [executor.submit(simple_func) for i in range(0,12)]
# ----------------------------------------------------------------------------
Expected Results
The final multiprocessing call (STEP 3) prints ‘Fit complete’ 12 times.
Actual Results
The final multiprocessing call (STEP 3) freezes.
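One way to make the freeze observable, rather than blocking indefinitely, is to wait on the submitted futures with a timeout. This is only a sketch: report_unfinished is a hypothetical helper that is not part of the original reproduction, and it relies solely on concurrent.futures.wait from the standard library.

```python
import concurrent.futures


def report_unfinished(futures, timeout):
    """Wait up to `timeout` seconds and return (finished, pending) counts."""
    done, not_done = concurrent.futures.wait(futures, timeout=timeout)
    # Calling result() re-raises any exception a finished task hit,
    # instead of letting it pass unnoticed.
    for future in done:
        future.result()
    return len(done), len(not_done)
```

Calling report_unfinished(results_temp, timeout=30) inside the `with` block of STEP 3 would return how many tasks finished and how many are still pending. Leaving the `with` block still waits for the workers, so a stuck pool can block there even after the helper has returned.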
Versions
System:
    python: 3.7.6 (default, Jan 8 2020, 13:42:34) [Clang 4.0.1 (tags/RELEASE_401/final)]
    executable: /Users/danil/anaconda3/bin/python
    machine: Darwin-19.6.0-x86_64-i386-64bit
Python dependencies:
    pip: 20.2.3
    setuptools: 46.0.0.post20200309
    sklearn: 0.22.1
    numpy: 1.18.1
    scipy: 1.4.1
    Cython: 0.29.15
    pandas: 1.0.5
    matplotlib: 3.3.2
    joblib: 0.17.0
Built with OpenMP: True
Darwin-19.6.0-x86_64-i386-64bit
Python 3.7.6 (default, Jan 8 2020, 13:42:34) [Clang 4.0.1 (tags/RELEASE_401/final)]
NumPy 1.18.1
SciPy 1.4.1
Scikit-Learn 0.22.1
Issue Analytics
- State:
- Created 3 years ago
- Comments:5 (1 by maintainers)
Thanks @DanilZherebtsov for reaching out and thanks @rworreby for investigating the issue, this was really helpful. If I understand correctly this is not a bug in scikit-learn: I’m going to close this issue, feel free to reopen if there is still something to solve.
Hi rworreby, thanks for looking into this issue!
Could you please explain what your comment “Built with OpenMP: True” means and how I can check the current status of this setting?
P.S. I have managed to solve the problem by inserting at the beginning of my program:
as explained here: https://scikit-learn.org/stable/faq.html#why-do-i-sometime-get-a-crash-freeze-with-n-jobs-1-under-osx-or-linux
But the shell loses some interactivity, as intermediate results are no longer printed while the program is executing.
P.P.S. My setuptools version is ‘46.0.0.post20200309’.
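The exact snippet is not shown in the comment above. The FAQ entry it links to recommends moving away from the default fork start method, for example by calling multiprocessing.set_start_method('forkserver') inside an if __name__ == '__main__' guard. A comparable sketch for ProcessPoolExecutor follows, assuming Python 3.7+ (for the mp_context parameter) and a platform where 'forkserver' is available; run_in_fresh_workers is a hypothetical helper, and math.factorial merely stands in for real work.

```python
import concurrent.futures
import math
import multiprocessing


def run_in_fresh_workers(func, args, start_method="forkserver"):
    # Build a context for the requested start method so that workers are not
    # plain forks of the parent process and its already-initialised thread pools.
    ctx = multiprocessing.get_context(start_method)
    with concurrent.futures.ProcessPoolExecutor(mp_context=ctx) as executor:
        return list(executor.map(func, args))


if __name__ == "__main__":
    # With "forkserver" or "spawn", workers re-import the main module,
    # so the pool must be created behind this guard.
    print(run_in_fresh_workers(math.factorial, [3, 4, 5]))
```

Passing a context to the executor keeps the change local to this pool, whereas set_start_method changes the default for the whole program.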