
Infinite loop bug in GridSearchCV with svm.SVC(), Windows 10

See original GitHub issue

This report will be a bit short, as I cannot determine with 100% accuracy which steps cause this bug.

Setup:

  • Windows 10, most recent update
  • Anaconda, most recent update
  • Jupyter Notebook and prompt, latest update (issue tested in both; no warnings, errors, or bugs are printed in either)
  • All packages (scikit-learn, numpy, etc.), latest update
  • AMD FX 8350 (8 cores), Nvidia GeForce GTX 980, 16 GB RAM

The issue: when running with n_jobs set to -1, my grid_search_wrapper runs fine with MLPClassifier() and takes up ~70% of CPU processing power. The jobs (192 candidates × 10-fold cross-validation = 1920 fits) run in about 8 minutes and return the expected dataframe of results.
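(The 192 candidates come from the MLP grid shown commented out in the code dump below; a quick sanity-check sketch with scikit-learn's ParameterGrid confirms the count:)

from sklearn.model_selection import ParameterGrid

mlp_params = {
    'activation': ['relu', 'tanh', 'logistic'],
    'alpha': [1e-3, 1e-4, 1e-5, 1e-6],
    'hidden_layer_sizes': [[100, 25], [50, 50], [75, 25, 25], [50, 25, 10]],
    'max_iter': [100, 500, 1000, 2500]
}
print(len(ParameterGrid(mlp_params)))  # 3 * 4 * 4 * 4 = 192 candidates; x 10 folds = 1920 fits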

When clf is set to an SVM (svm.SVC()), the process always starts up and prints:

Fitting X folds for each of Y candidates, totalling (sic) X*Y fits

After this, my computer sits for hours without any progress. Killing the kernel does not halt the ~10-15 spawned processes. When n_jobs is set to -1, killing python through Task Manager ends the CPU usage. When n_jobs = 1, my CPU usage is only ~20% (I believe only one core is being utilized), but no python processes are spawned in Task Manager, so I have to restart my computer to stop the single-core calculation.

Note that training individual models without passing them through the gridsearch function succeeds. I have not tested all combinations by hand, but I have tested each kernel individually. Training an SVM model took, on average, 1-2 minutes when run on its own.
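For example, a single fit outside the wrapper finishes normally (a minimal sketch; x_train and y_train are the same training arrays used in the code dump below):

from sklearn import svm

clf = svm.SVC(kernel='poly')   # each kernel was tried individually like this
clf.fit(x_train, y_train)      # finishes in ~1-2 minutes on this dataset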

Here are the variations of inputs and the resulting outputs of the grid_search_wrapper function:

with n_jobs = -1:
ml_params = {
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
    'degree': [2,3,4],
    'tol': [1e-3, 1e-4, 1e-2]
}

FAIL

ml_params = {
    'kernel': ['linear', 'rbf', 'sigmoid'],
}

PASS

ml_params = {
    'kernel': ['linear', 'rbf', 'sigmoid'],
    'tol': [1e-3, 1e-4, 1e-2]
}

FAIL

ml_params = {
    'kernel': ['linear']
}

PASS

ml_params = {
    'kernel': ['rbf']
}

PASS

ml_params = {
    'kernel': ['sigmoid']
}

PASS

ml_params = {
    'kernel': ['poly']
}

FAIL

with n_jobs = 1:

ml_params = {
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
    'degree': [2,3,4],
    'tol': [1e-3, 1e-4, 1e-2]
}

FAIL

ml_params = {
    'kernel': ['linear', 'rbf', 'sigmoid'],
}

FAIL

ml_params = {
    'kernel': ['linear', 'rbf', 'sigmoid'],
    'tol': [1e-3, 1e-4, 1e-2]
}

FAIL


ml_params = {
    'kernel': ['linear']
}

PASS


ml_params = {
    'kernel': ['rbf']
}

PASS


ml_params = {
    'kernel': ['sigmoid']
}

PASS


ml_params = {
    'kernel': ['poly']
}

FAIL
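In short, the 'poly' kernel fails regardless of the n_jobs setting, so a minimal reproducer (assuming the wrapper, scorers, and data from the code dump below) is simply:

ml_params = {
    'kernel': ['poly']
}
# hangs indefinitely after printing the "Fitting ... fits" line
grid_search_wrapper(clf=svm.SVC(), param_grid=ml_params, scoring=scorers,
                    X_train=x_train, X_test=x_test,
                    y_train=y_train, y_test=y_test)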

Note that k_folds was set to 3 instead of the 10 used when training MLPClassifier, to make it faster to figure out what was happening with SVM. I think this setting is irrelevant to the problem.

Data set: 15000 instances × 90 predictors (relatively small; memory usage is about 2 GB during SVM runs)

Code dump:

def grid_search_wrapper(clf, param_grid, scoring, X_train, X_test, y_train, y_test, refit_score='accuracy_score'):
    """
    Fits a GridSearchCV classifier, using refit_score for optimization,
    and prints classifier performance metrics.
    """
    # https://towardsdatascience.com/fine-tuning-a-classifier-in-scikit-learn-66e048c21e65
    import pandas as pd
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import GridSearchCV, StratifiedKFold

    skf = StratifiedKFold(n_splits=3)
    grid_search = GridSearchCV(clf, param_grid, cv=skf, scoring=scoring,
                               refit=refit_score, return_train_score=True,
                               n_jobs=1, verbose=1)  # n_jobs was varied between 1 and -1 in the tests above
    grid_search.fit(X_train, y_train)

    # make the predictions on the held-out test set
    y_pred = grid_search.predict(X_test)

    print('Best params for {}'.format(refit_score))
    print(grid_search.best_params_)

    # confusion matrix on the test data
    print('\nConfusion matrix of model optimized for {} on the test data:'.format(refit_score))
    print(pd.DataFrame(confusion_matrix(y_test, y_pred),
                       columns=['pred_neg', 'pred_pos'], index=['neg', 'pos']))
    return grid_search

# apologies for the messy code and for not importing at the top
import pandas as pd
from sklearn import svm
from sklearn.metrics import (roc_curve, precision_recall_curve, auc, make_scorer,
                             recall_score, accuracy_score, precision_score, confusion_matrix)

#gridsearch for optimal MLPs
#ml_params = {
#    'activation': ['relu', 'tanh', 'logistic'],
#    'alpha': [1e-3, 1e-4, 1e-5, 1e-6],
#    'hidden_layer_sizes': [[100,25,], [50,50,], [75,25,25], [50,25,10]],
#    'max_iter': [100, 500, 1000, 2500]    
#}

ml_params = {
    # 'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],  # poly kernel is problematic
    'kernel': ['linear', 'rbf', 'sigmoid'],
    # 'degree': [2, 3, 4],
    'tol': [1e-3, 1e-4, 1e-2]
}
# SVC defaults, for reference:
# SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
#     decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
#     max_iter=-1, probability=False, random_state=None, shrinking=True,
#     tol=0.001, verbose=False)
scorers = {
    'precision_score': make_scorer(precision_score),
    'recall_score': make_scorer(recall_score),
    'accuracy_score': make_scorer(accuracy_score)
}
grid_search_clf = grid_search_wrapper(clf=svm.SVC(), param_grid=ml_params, scoring=scorers,
                                      X_train=x_train, X_test=x_test,
                                      y_train=y_train, y_test=y_test,
                                      refit_score='recall_score')


results = pd.DataFrame(grid_search_clf.cv_results_)
results = results.sort_values(by='mean_test_recall_score', ascending=False) 

#for MLP
#results[['mean_test_precision_score', 'mean_test_accuracy_score', 'mean_test_recall_score', 'param_activation', 'param_alpha', 'param_hidden_layer_sizes', 'param_max_iter']]

#for svm
results[['mean_test_precision_score', 'mean_test_accuracy_score', 'mean_test_recall_score', 'param_kernel', 'param_tol']]

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 16 (6 by maintainers)

Top GitHub Comments

1 reaction
jnothman commented, Feb 17, 2019

No, the issue is probably the infinite max_iter.

1 reaction
jnothman commented, Feb 16, 2019

Searching over tol is an unusual thing to do. You should try setting a finite max_iter.
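Following that suggestion, here is a sketch of the workaround: SVC's max_iter defaults to -1 (no limit), so libsvm can iterate indefinitely on a problem where it fails to converge. Capping it forces every fit to terminate; the 10000 below is an arbitrary illustrative value, not one from the thread, and scikit-learn emits a ConvergenceWarning for any fit that hits the cap:

from sklearn import svm

# cap the libsvm solver so non-converging fits stop instead of hanging;
# 10000 is an arbitrary example value
clf = svm.SVC(max_iter=10000)
grid_search_clf = grid_search_wrapper(clf=clf, param_grid=ml_params, scoring=scorers,
                                      X_train=x_train, X_test=x_test,
                                      y_train=y_train, y_test=y_test,
                                      refit_score='recall_score')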

Read more comments on GitHub >

