cross_val_score issue with n_jobs = -1 on Windows

Description

cross_val_score raises an error when called with n_jobs = -1. With n_jobs = 1, it works fine.

Steps/Code to Reproduce

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Load the data: column positions 3 to 12 are the features, position 13 is the target.
dataset = pd.read_csv('example.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values

# Encode the two categorical feature columns as integers.
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])

# One-hot encode column 1, then drop the first column of the result.
onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
X = X[:, 1:]

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense


def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier

# Wrap the Keras model for scikit-learn and run 10-fold cross-validation on all cores.
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, epochs = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

Expected Results

I expected the example to run several cross-validation folds in parallel.

Actual Results

Error:

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
Traceback (most recent call last):

  File "<ipython-input-4-cc51c2d2980a>", line 1, in <module>
    accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

I think the problem is related to the combination of Windows, Keras, and joblib.

Versions

I’m using:

  • 64-bit Windows 10
  • Python 3.7.1
  • scikit-learn 0.20.1
  • Spyder 3.3.3

Thanks for the help.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 4
  • Comments: 45 (13 by maintainers)

Top GitHub Comments

jnothman commented, Feb 24, 2019 (5 reactions)

Put build_classifier in a separate module and import it.
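
A minimal sketch of that suggestion, using only the standard library so it runs without Keras. The module name classifier_builder and the placeholder return value are hypothetical; in the real script the function body would be the Sequential model from the reproduction code. The sketch writes the module to a temporary directory, imports it, and shows that pickle stores an imported function as a reference to its module and name. That is one plausible reason the suggestion helps, though the thread excerpt here does not confirm it: a freshly started worker process on Windows can resolve such a reference by importing the module, provided the module is importable there.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, chosen for illustration only.
MODULE_NAME = "classifier_builder"

# Stand-in for the real function; the Keras model code would go here.
MODULE_SOURCE = '''
def build_classifier():
    return "model placeholder"
'''

# Write the module to a temporary directory and make it importable.
module_dir = Path(tempfile.mkdtemp())
(module_dir / (MODULE_NAME + ".py")).write_text(MODULE_SOURCE)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

builder = importlib.import_module(MODULE_NAME)

# pickle stores the function as "module name + function name", not its code.
payload = pickle.dumps(builder.build_classifier)

# Unpickling re-imports the module and looks the function up by name.
restored = pickle.loads(payload)
print(restored.__module__, restored.__name__)
```

In the actual project, this would presumably mean saving build_classifier (together with its Keras imports) in a file such as classifier_builder.py next to the script, then writing from classifier_builder import build_classifier before constructing KerasClassifier(build_fn = build_classifier, ...).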

hanss01 commented, Jan 3, 2020 (4 reactions)

so you all come from deep learning a-z right? XD

