n_jobs in GridSearchCV issue (again in 0.18.1)


Hey, thanks for an awesome library!

There seems to be an issue with GridSearchCV (apparently the same as #6147) that is still present in 0.18.1, as it was in previous versions. I run into it when using GridSearchCV with the ElasticNet classifier:

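import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

# X_train (sparse matrix) and y_train (list of '0'/'1' labels) come from my own pipeline
vals = [10 ** i for i in range(0, -10, -1)] + list(np.arange(0.1, 0.999, 0.025))
parameters = {'l1_ratio': vals, 'alpha': vals}
elastic = ElasticNet(max_iter = 1000)
clf = GridSearchCV(elastic, param_grid = parameters, n_jobs = 8, scoring = 'accuracy', cv = 10)
clf.fit(X_train, y_train)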
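
Judging from the traceback that follows, the error may not depend on n_jobs at all: the scorer receives binary labels in y_true and real-valued numbers in y_pred, which is what a regressor such as ElasticNet returns from predict. The sketch below tries to reproduce that mismatch on a small synthetic dataset and then shows one possible alternative, SGDClassifier with penalty="elasticnet", which can be scored with accuracy. The data, the alpha value, and the tiny parameter grid are assumptions made for illustration, not the reporter's setup, and the exact wording of the error message varies between scikit-learn versions.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV

# Toy data standing in for X_train / y_train (an assumption for illustration).
rng = np.random.RandomState(0)
X = rng.rand(60, 5)
y = (X[:, 0] > 0.5).astype(int)  # binary labels

# ElasticNet is a regressor, so predict() returns real-valued outputs.
reg = ElasticNet(alpha=0.01, max_iter=1000).fit(X, y)
y_pred = reg.predict(X)

# Accuracy compares discrete labels, so it rejects continuous predictions.
try:
    accuracy_score(y, y_pred)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# One possible alternative: a classifier with an elastic-net penalty.
clf = SGDClassifier(penalty="elasticnet", max_iter=1000, tol=1e-3, random_state=0)
grid = {"alpha": [1e-4, 1e-3], "l1_ratio": [0.15, 0.5]}
search = GridSearchCV(clf, param_grid=grid, scoring="accuracy", cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

If a regression model is what is actually wanted, keeping ElasticNet and switching to a regression scorer such as scoring='r2' would be the other way to avoid the mismatch.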

Actual Results

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 344, in __call__
    return self.func(*args, **kwargs)
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 131, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 260, in _fit_and_score
    test_score = _score(estimator, X_test, y_test, scorer)
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 288, in _score
    score = scorer(estimator, X_test, y_test)
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/scorer.py", line 98, in __call__
    **self._kwargs)
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 172, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 82, in _check_targets
    "".format(type_true, type_pred))
ValueError: Can't handle mix of binary and continuous

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/zeerak/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 353, in __call__
    raise TransportableException(text, e_type)
sklearn.externals.joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
ValueError                                         Mon Jul  3 02:20:59 2017
PID: 22695                 Python 3.6.0: /home/zeerak/anaconda3/bin/python3
...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=<sklearn.externals.joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _fit_and_score>, (ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], make_scorer(accuracy_score), array([ 530,  531,  532, ..., 5292, 5293, 5294]), array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), 0, {'alpha': 1, 'l1_ratio': 1}), {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True})]
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _fit_and_score>
        args = (ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], make_scorer(accuracy_score), array([ 530,  531,  532, ..., 5292, 5293, 5294]), array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), 0, {'alpha': 1, 'l1_ratio': 1})
        kwargs = {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True}
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py in _fit_and_score(estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X=<5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, y=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], scorer=make_scorer(accuracy_score), train=array([ 530,  531,  532, ..., 5292, 5293, 5294]), test=array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), verbose=0, parameters={'alpha': 1, 'l1_ratio': 1}, fit_params={}, return_train_score=True, return_parameters=True, return_n_test_samples=True, return_times=True, error_score='raise')
    255                              " numeric value. (Hint: if using 'raise', please"
    256                              " make sure that it has been spelled correctly.)")
    257
    258     else:
    259         fit_time = time.time() - start_time
--> 260         test_score = _score(estimator, X_test, y_test, scorer)
        test_score = undefined
        estimator = ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False)
        X_test = <530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>
        y_test = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        scorer = make_scorer(accuracy_score)
    261         score_time = time.time() - start_time - fit_time
    262         if return_train_score:
    263             train_score = _score(estimator, X_train, y_train, scorer)
    264

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py in _score(estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X_test=<530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>, y_test=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], scorer=make_scorer(accuracy_score))
    283 def _score(estimator, X_test, y_test, scorer):
    284     """Compute the score of an estimator on a given test set."""
    285     if y_test is None:
    286         score = scorer(estimator, X_test)
    287     else:
--> 288         score = scorer(estimator, X_test, y_test)
        score = undefined
        scorer = make_scorer(accuracy_score)
        estimator = ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False)
        X_test = <530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>
        y_test = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
    289     if hasattr(score, 'item'):
    290         try:
    291             # e.g. unwrap memmapped scalars
    292             score = score.item()

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/scorer.py in __call__(self=make_scorer(accuracy_score), estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X=<530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>, y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], sample_weight=None)
     93             return self._sign * self._score_func(y_true, y_pred,
     94                                                  sample_weight=sample_weight,
     95                                                  **self._kwargs)
     96         else:
     97             return self._sign * self._score_func(y_true, y_pred,
---> 98                                                  **self._kwargs)
        self._kwargs = {}
     99
    100
    101 class _ProbaScorer(_BaseScorer):
    102     def __call__(self, clf, X, y, sample_weight=None):

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py in accuracy_score(y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], y_pred=array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042]), normalize=True, sample_weight=None)
    167     >>> accuracy_score(np.array([[0, 1], [1, 1]]), np.ones((2, 2)))
    168     0.5
    169     """
    170
    171     # Compute accuracy for each possible representation
--> 172     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
        y_type = undefined
        y_true = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        y_pred = array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042])
    173     if y_type.startswith('multilabel'):
    174         differing_labels = count_nonzero(y_true - y_pred, axis=1)
    175         score = differing_labels == 0
    176     else:

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py in _check_targets(y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], y_pred=array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042]))
     77     if y_type == set(["binary", "multiclass"]):
     78         y_type = set(["multiclass"])
     79
     80     if len(y_type) > 1:
     81         raise ValueError("Can't handle mix of {0} and {1}"
---> 82                          "".format(type_true, type_pred))
        type_true = 'binary'
        type_pred = 'continuous'
     83
     84     # We can't have more than one value on y_type => The set is no more needed
     85     y_type = y_type.pop()
     86

ValueError: Can't handle mix of binary and continuous
___________________________________________________________________________
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 682, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/zeerak/anaconda3/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value
sklearn.externals.joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
ValueError                                         Mon Jul  3 02:20:59 2017
PID: 22695                 Python 3.6.0: /home/zeerak/anaconda3/bin/python3
...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=<sklearn.externals.joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _fit_and_score>, (ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], make_scorer(accuracy_score), array([ 530,  531,  532, ..., 5292, 5293, 5294]), array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), 0, {'alpha': 1, 'l1_ratio': 1}), {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True})]
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _fit_and_score>
        args = (ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], make_scorer(accuracy_score), array([ 530,  531,  532, ..., 5292, 5293, 5294]), array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), 0, {'alpha': 1, 'l1_ratio': 1})
        kwargs = {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True}
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py in _fit_and_score(estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X=<5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, y=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], scorer=make_scorer(accuracy_score), train=array([ 530,  531,  532, ..., 5292, 5293, 5294]), test=array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), verbose=0, parameters={'alpha': 1, 'l1_ratio': 1}, fit_params={}, return_train_score=True, return_parameters=True, return_n_test_samples=True, return_times=True, error_score='raise')
    255                              " numeric value. (Hint: if using 'raise', please"
    256                              " make sure that it has been spelled correctly.)")
    257
    258     else:
    259         fit_time = time.time() - start_time
--> 260         test_score = _score(estimator, X_test, y_test, scorer)
        test_score = undefined
        estimator = ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False)
        X_test = <530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>
        y_test = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        scorer = make_scorer(accuracy_score)
    261         score_time = time.time() - start_time - fit_time
    262         if return_train_score:
    263             train_score = _score(estimator, X_train, y_train, scorer)
    264

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py in _score(estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X_test=<530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>, y_test=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], scorer=make_scorer(accuracy_score))
    283 def _score(estimator, X_test, y_test, scorer):
    284     """Compute the score of an estimator on a given test set."""
    285     if y_test is None:
    286         score = scorer(estimator, X_test)
    287     else:
--> 288         score = scorer(estimator, X_test, y_test)
        score = undefined
        scorer = make_scorer(accuracy_score)
        estimator = ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False)
        X_test = <530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>
        y_test = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
    289     if hasattr(score, 'item'):
    290         try:
    291             # e.g. unwrap memmapped scalars
    292             score = score.item()

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/scorer.py in __call__(self=make_scorer(accuracy_score), estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X=<530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>, y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], sample_weight=None)
     93             return self._sign * self._score_func(y_true, y_pred,
     94                                                  sample_weight=sample_weight,
     95                                                  **self._kwargs)
     96         else:
     97             return self._sign * self._score_func(y_true, y_pred,
---> 98                                                  **self._kwargs)
        self._kwargs = {}
     99
    100
    101 class _ProbaScorer(_BaseScorer):
    102     def __call__(self, clf, X, y, sample_weight=None):

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py in accuracy_score(y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], y_pred=array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042]), normalize=True, sample_weight=None)
    167     >>> accuracy_score(np.array([[0, 1], [1, 1]]), np.ones((2, 2)))
    168     0.5
    169     """
    170
    171     # Compute accuracy for each possible representation
--> 172     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
        y_type = undefined
        y_true = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        y_pred = array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042])
    173     if y_type.startswith('multilabel'):
    174         differing_labels = count_nonzero(y_true - y_pred, axis=1)
    175         score = differing_labels == 0
    176     else:

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py in _check_targets(y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], y_pred=array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042]))
     77     if y_type == set(["binary", "multiclass"]):
     78         y_type = set(["multiclass"])
     79
     80     if len(y_type) > 1:
     81         raise ValueError("Can't handle mix of {0} and {1}"
---> 82                          "".format(type_true, type_pred))
        type_true = 'binary'
        type_pred = 'continuous'
     83
     84     # We can't have more than one value on y_type => The set is no more needed
     85     y_type = y_type.pop()
     86

ValueError: Can't handle mix of binary and continuous
___________________________________________________________________________

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run.py", line 333, in <module>
    runner(args)
  File "run.py", line 283, in runner
    feats.run_gridsearch_elastic(X_train, X_test, Y_train, Y_test)
  File "/home/zeerak/Projects/repos/gender_spectrum/features.py", line 216, in run_gridsearch_elastic
    clf.fit(X_train, y_train)
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 945, in fit
    return self._fit(X, y, groups, ParameterGrid(self.param_grid))
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 564, in _fit
    for parameters in parameter_iterable
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 768, in __call__
    self.retrieve()
  File "/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 719, in retrieve
    raise exception
sklearn.externals.joblib.my_exceptions.JoblibValueError: JoblibValueError
___________________________________________________________________________
Multiprocessing exception:
...........................................................................
/home/zeerak/Projects/repos/gender_spectrum/run.py in <module>()
    328     parser.add_argument('--probs'     , action = 'store_true', help = argparse.SUPPRESS)
    329     parser.add_argument('--sgdparams' , nargs = 2            , help = argparse.SUPPRESS)
    330
    331     args = parser.parse_args()
    332
--> 333     runner(args)
    334
    335
    336
    337

...........................................................................
/home/zeerak/Projects/repos/gender_spectrum/run.py in runner(args=Namespace(data=None, feats=['unigram-counts'], h...dparams=None, test=None, user=True, workers=None))
    278             feats.run_gridsearch_sgd(X_train, X_test, Y_train, Y_test)
    279             # preds, weights, intercept = feats.run_sgd(X_train, X_test, Y_train, Y_test)
    280
    281         elif args.method == 'Elastic':
    282             print('Run Elastic Net...', file=sys.stdout)
--> 283             feats.run_gridsearch_elastic(X_train, X_test, Y_train, Y_test)
        X_train = <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>
        X_test = <1324x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>
        Y_train = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        Y_test = ['1', '0', '1', '1', '1', '1', '0', '1', '0', '0', '0', '1', '1', '0', '1', '1', '0', '0', '0', '0', ...]
    284             # preds, weights, intercept = feats.run_sgd(X_train, X_test, Y_train, Y_test)
    285
    286
    287 if __name__ == "__main__":

...........................................................................
/home/zeerak/Projects/repos/gender_spectrum/features.py in run_gridsearch_elastic(X_train=<5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, X_test=<1324x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, y_train=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], y_test=['1', '0', '1', '1', '1', '1', '0', '1', '0', '0', '0', '1', '1', '0', '1', '1', '0', '0', '0', '0', ...])
    211 def run_gridsearch_elastic(X_train, X_test, y_train, y_test):
    212     vals = [10 ** i for i in range(0, -10, -1)] + list(np.arange(0.1, 0.999, 0.015))
    213     parameters = {'l1_ratio': vals, 'alpha': vals}
    214     elastic = ElasticNet(max_iter = 1000)
    215     clf = GridSearchCV(elastic, param_grid = parameters, n_jobs = 8, scoring = 'accuracy', cv = 10)
--> 216     clf.fit(X_train, y_train)
        clf.fit = <bound method GridSearchCV.fit of GridSearchCV(c...core=True,
       scoring='accuracy', verbose=0)>
        X_train = <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>
        y_train = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
    217     preds = clf.predict(X_test)
    218     print(clf.best_estimator_, clf.best_score_, clf.best_params_)
    219     print("Accuracy: %s" % accuracy_score(y_test, preds))
    220

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_search.py in fit(self=GridSearchCV(cv=10, error_score='raise',
       ...score=True,
       scoring='accuracy', verbose=0), X=<5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, y=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], groups=None)
    940
    941         groups : array-like, with shape (n_samples,), optional
    942             Group labels for the samples used while splitting the dataset into
    943             train/test set.
    944         """
--> 945         return self._fit(X, y, groups, ParameterGrid(self.param_grid))
        self._fit = <bound method BaseSearchCV._fit of GridSearchCV(...core=True,
       scoring='accuracy', verbose=0)>
        X = <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>
        y = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        groups = None
        self.param_grid = {'alpha': [1, 0.1, 0.01, 0.001, 0.0001, 1e-05, 1e-06, 1e-07, 1e-08, 1e-09, 0.10000000000000001, 0.115, 0.13, 0.14500000000000002, 0.16, 0.17499999999999999, 0.19, 0.20500000000000002, 0.22, 0.23500000000000001, ...], 'l1_ratio': [1, 0.1, 0.01, 0.001, 0.0001, 1e-05, 1e-06, 1e-07, 1e-08, 1e-09, 0.10000000000000001, 0.115, 0.13, 0.14500000000000002, 0.16, 0.17499999999999999, 0.19, 0.20500000000000002, 0.22, 0.23500000000000001, ...]}
    946
    947
    948 class RandomizedSearchCV(BaseSearchCV):
    949     """Randomized search on hyper parameters.

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_search.py in _fit(self=GridSearchCV(cv=10, error_score='raise',
       ...score=True,
       scoring='accuracy', verbose=0), X=<5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, y=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], groups=None, parameter_iterable=<sklearn.model_selection._search.ParameterGrid object>)
    559                                   fit_params=self.fit_params,
    560                                   return_train_score=self.return_train_score,
    561                                   return_n_test_samples=True,
    562                                   return_times=True, return_parameters=True,
    563                                   error_score=self.error_score)
--> 564           for parameters in parameter_iterable
        parameters = undefined
        parameter_iterable = <sklearn.model_selection._search.ParameterGrid object>
    565           for train, test in cv_iter)
    566
    567         # if one choose to see train score, "out" will contain train score info
    568         if self.return_train_score:

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=Parallel(n_jobs=8), iterable=<generator object BaseSearchCV._fit.<locals>.<genexpr>>)
    763             if pre_dispatch == "all" or n_jobs == 1:
    764                 # The iterable was consumed all at once by the above for loop.
    765                 # No need to wait for async callbacks to trigger to
    766                 # consumption.
    767                 self._iterating = False
--> 768             self.retrieve()
        self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=8)>
    769             # Make sure that we get a last message telling us we are done
    770             elapsed_time = time.time() - self._start_time
    771             self._print('Done %3i out of %3i | elapsed: %s finished',
    772                         (len(self._output), len(self._output),

Sub-process traceback:

ValueError                                         Mon Jul  3 02:20:59 2017
PID: 22695                 Python 3.6.0: /home/zeerak/anaconda3/bin/python3
...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=<sklearn.externals.joblib.parallel.BatchedCalls object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _fit_and_score>, (ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], make_scorer(accuracy_score), array([ 530,  531,  532, ..., 5292, 5293, 5294]), array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), 0, {'alpha': 1, 'l1_ratio': 1}), {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True})]
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in <listcomp>(.0=<list_iterator object>)
    126     def __init__(self, iterator_slice):
    127         self.items = list(iterator_slice)
    128         self._size = len(self.items)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _fit_and_score>
        args = (ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), <5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], make_scorer(accuracy_score), array([ 530,  531,  532, ..., 5292, 5293, 5294]), array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), 0, {'alpha': 1, 'l1_ratio': 1})
        kwargs = {'error_score': 'raise', 'fit_params': {}, 'return_n_test_samples': True, 'return_parameters': True, 'return_times': True, 'return_train_score': True}
    132
    133     def __len__(self):
    134         return self._size
    135

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py in _fit_and_score(estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X=<5295x86231 sparse matrix of type '<class 'numpy... stored elements in Compressed Sparse Row format>, y=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], scorer=make_scorer(accuracy_score), train=array([ 530,  531,  532, ..., 5292, 5293, 5294]), test=array([  0,   1,   2,   3,   4,   5,   6,   7,  ...20, 521, 522, 523, 524, 525, 526, 527, 528, 529]), verbose=0, parameters={'alpha': 1, 'l1_ratio': 1}, fit_params={}, return_train_score=True, return_parameters=True, return_n_test_samples=True, return_times=True, error_score='raise')
    255                              " numeric value. (Hint: if using 'raise', please"
    256                              " make sure that it has been spelled correctly.)")
    257
    258     else:
    259         fit_time = time.time() - start_time
--> 260         test_score = _score(estimator, X_test, y_test, scorer)
        test_score = undefined
        estimator = ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False)
        X_test = <530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>
        y_test = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        scorer = make_scorer(accuracy_score)
    261         score_time = time.time() - start_time - fit_time
    262         if return_train_score:
    263             train_score = _score(estimator, X_train, y_train, scorer)
    264

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_validation.py in _score(estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X_test=<530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>, y_test=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], scorer=make_scorer(accuracy_score))
    283 def _score(estimator, X_test, y_test, scorer):
    284     """Compute the score of an estimator on a given test set."""
    285     if y_test is None:
    286         score = scorer(estimator, X_test)
    287     else:
--> 288         score = scorer(estimator, X_test, y_test)
        score = undefined
        scorer = make_scorer(accuracy_score)
        estimator = ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False)
        X_test = <530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>
        y_test = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
    289     if hasattr(score, 'item'):
    290         try:
    291             # e.g. unwrap memmapped scalars
    292             score = score.item()

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/scorer.py in __call__(self=make_scorer(accuracy_score), estimator=ElasticNet(alpha=1, copy_X=True, fit_intercept=T...selection='cyclic', tol=0.0001, warm_start=False), X=<530x86231 sparse matrix of type '<class 'numpy.... stored elements in Compressed Sparse Row format>, y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], sample_weight=None)
     93             return self._sign * self._score_func(y_true, y_pred,
     94                                                  sample_weight=sample_weight,
     95                                                  **self._kwargs)
     96         else:
     97             return self._sign * self._score_func(y_true, y_pred,
---> 98                                                  **self._kwargs)
        self._kwargs = {}
     99
    100
    101 class _ProbaScorer(_BaseScorer):
    102     def __call__(self, clf, X, y, sample_weight=None):

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py in accuracy_score(y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], y_pred=array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042]), normalize=True, sample_weight=None)
    167     >>> accuracy_score(np.array([[0, 1], [1, 1]]), np.ones((2, 2)))
    168     0.5
    169     """
    170
    171     # Compute accuracy for each possible representation
--> 172     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
        y_type = undefined
        y_true = ['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...]
        y_pred = array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042])
    173     if y_type.startswith('multilabel'):
    174         differing_labels = count_nonzero(y_true - y_pred, axis=1)
    175         score = differing_labels == 0
    176     else:

...........................................................................
/home/zeerak/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py in _check_targets(y_true=['0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '0', ...], y_pred=array([ 0.5320042,  0.5320042,  0.5320042,  0.53...  0.5320042,  0.5320042,  0.5320042,  0.5320042]))
     77     if y_type == set(["binary", "multiclass"]):
     78         y_type = set(["multiclass"])
     79
     80     if len(y_type) > 1:
     81         raise ValueError("Can't handle mix of {0} and {1}"
---> 82                          "".format(type_true, type_pred))
        type_true = 'binary'
        type_pred = 'continuous'
     83
     84     # We can't have more than one value on y_type => The set is no more needed
     85     y_type = y_type.pop()
     86

ValueError: Can't handle mix of binary and continuous
___________________________________________________________________________
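
The final ValueError in the traceback can probably be reproduced without GridSearchCV or joblib. The sketch below is illustrative and not part of the original report. It assumes the labels and predictions look like the `y_true` and `y_pred` values printed in the traceback (string class labels against constant float predictions), and it is written against a recent scikit-learn release, where the message wording differs slightly from 0.18.1.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# String class labels, mirroring y_true in the traceback (values are illustrative).
y_true = ['0', '1', '0', '0', '1']
# Constant float output, mirroring the y_pred array produced by ElasticNet.predict.
y_pred = np.full(len(y_true), 0.5320042)

try:
    accuracy_score(y_true, y_pred)
    message = None
except ValueError as exc:
    # accuracy_score rejects the pair because the targets are 'binary'
    # while the predictions are 'continuous'.
    message = str(exc)

print(message)
```

If this raises on its own, the failure comes from pairing a classification metric with real-valued predictions rather than from the parallel backend.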

Versions

platform.platform(): Linux-4.2.3-300.fc23.x86_64-x86_64-with-fedora-25-Twenty_Five
sys.version: Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
numpy.version: NumPy 1.11.3
scipy.version: SciPy 0.18.1
sklearn.version: Scikit-Learn 0.18.1

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
jnothman commented, Jul 4, 2017

I’ll try to improve the error message from “Can’t handle mix of binary and continuous” to “Classification metrics can’t handle mix of binary and continuous targets”.

On 4 July 2017 at 10:38, Zeerak Waseem notifications@github.com wrote:

@GaelVaroquaux Yup, it only occurs with n_jobs > 1. Fixing the errors with the invalid y does not change it. However, changing the scoring fixed it. I had missed that I hadn't updated the scoring.

@jnothman Thanks for that, this was the issue! Not entirely sure why this brought out the issue though.
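
For readers landing here with the same error, the scoring fix mentioned above can be sketched as follows. ElasticNet is a regressor, so a regression metric is used in place of 'accuracy'. The synthetic data from `make_regression`, the small parameter grid, the choice of 'neg_mean_squared_error', and `n_jobs=2` are all assumptions made for illustration; the thread does not say which scoring the reporter switched to.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

# Synthetic numeric targets stand in for the reporter's data (assumption).
X, y = make_regression(n_samples=200, n_features=20, noise=0.1, random_state=0)

# A much smaller grid than in the report, to keep the example quick.
parameters = {'alpha': [0.01, 0.1, 1.0], 'l1_ratio': [0.1, 0.5, 0.9]}

elastic = ElasticNet(max_iter=1000)
# A regression metric matches the continuous output of ElasticNet.predict.
clf = GridSearchCV(elastic, param_grid=parameters, n_jobs=2,
                   scoring='neg_mean_squared_error', cv=5)
clf.fit(X, y)

print(clf.best_params_, clf.best_score_)
```

If the labels really are classes, as the '0'/'1' strings in the traceback suggest, another option is to keep `scoring='accuracy'` and swap in a classifier with an elastic-net penalty, for example `SGDClassifier(penalty='elasticnet')`.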


1 reaction
GaelVaroquaux commented, Jul 3, 2017

Does the problem go away when using n_jobs=1?

Looking at the errors it seems that the error is due to an invalid y.


