Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

type error when fitting X,y

See original GitHub issue

Hi All,

I have created the training set for machine learning and when trying to fit model it gives a value error.

The code and error is as follows.

Code:

search.fit(train_final[X_cols_train], train_final['target'])

output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-105-9dea7c488bbf> in <module>
----> 1 search.fit(train_final[X_cols_train], train_final['target'])

~/env/lib/python3.5/site-packages/dask_ml/model_selection/_incremental.py in fit(self, X, y, **fit_params)
    572             Additional partial fit keyword arguments for the estimator.
    573         """
--> 574         return default_client().sync(self._fit, X, y, **fit_params)
    575 
    576     @if_delegate_has_method(delegate=("best_estimator_", "estimator"))

~/env/lib/python3.5/site-packages/distributed/client.py in sync(self, func, *args, **kwargs)
    671             return future
    672         else:
--> 673             return sync(self.loop, func, *args, **kwargs)
    674 
    675     def __repr__(self):

~/env/lib/python3.5/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    275             e.wait(10)
    276     if error[0]:
--> 277         six.reraise(*error[0])
    278     else:
    279         return result[0]

~/env/lib/python3.5/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

~/env/lib/python3.5/site-packages/distributed/utils.py in f()
    260             if timeout is not None:
    261                 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 262             result[0] = yield future
    263         except Exception as exc:
    264             error[0] = sys.exc_info()

~/env/lib/python3.5/site-packages/tornado/gen.py in run(self)
   1131 
   1132                     try:
-> 1133                         value = future.result()
   1134                     except Exception:
   1135                         self.had_exception = True

/usr/lib/python3.5/asyncio/futures.py in result(self)
    291             self._tb_logger = None
    292         if self._exception is not None:
--> 293             raise self._exception
    294         return self._result
    295 

~/env/lib/python3.5/site-packages/tornado/gen.py in wrapper(*args, **kwargs)
    324                 try:
    325                     orig_stack_contexts = stack_context._state.contexts
--> 326                     yielded = next(result)
    327                     if stack_context._state.contexts is not orig_stack_contexts:
    328                         yielded = _create_future()

~/env/lib/python3.5/site-packages/dask_ml/model_selection/_incremental.py in _fit(self, X, y, **fit_params)
    522     @gen.coroutine
    523     def _fit(self, X, y, **fit_params):
--> 524         X, y = self._check_array(X, y)
    525 
    526         X_train, X_test, y_train, y_test = self._get_train_test_split(X, y)

~/env/lib/python3.5/site-packages/dask_ml/model_selection/_incremental.py in _check_array(self, X, y, **kwargs)
    437         if isinstance(y, np.ndarray):
    438             y = da.from_array(y, y.shape)
--> 439         X = check_array(X, **kwargs)
    440         kwargs["ensure_2d"] = False
    441         y = check_array(y, **kwargs)

~/env/lib/python3.5/site-packages/dask_ml/utils.py in check_array(array, *args, **kwargs)
    149     elif isinstance(array, dd.DataFrame):
    150         if not accept_dask_dataframe:
--> 151             raise TypeError("This estimator does not support dask dataframes.")
    152         # TODO: sample?
    153         return array

TypeError: This estimator does not support dask dataframes.

Also I would like to know when fitting data do we need to re-code string values in columns (in categorical data) to numerical data. For example if there is a column with categories a,b,c do we have to re-code it as for example as 1,2,3. Furthermore if we need to do so then how do we make sure that the test set also is re-coded in the same pattern such as 1 for a and 2 for b.

Thank you

Michael

Issue Analytics

State:
Created 5 years ago
Comments:10 (5 by maintainers)

Top GitHub Comments

1reaction

MichaelSchrotercommented, Mar 12, 2019

Thanks That helped

1reaction

atyamsriharshacommented, Mar 12, 2019

Hi All,

I have created the training set for machine learning and when trying to fit model it gives a value error.

The code and error is as follows.

Code:

search.fit(train_final[X_cols_train], train_final['target'])

output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-105-9dea7c488bbf> in <module>
----> 1 search.fit(train_final[X_cols_train], train_final['target'])

~/env/lib/python3.5/site-packages/dask_ml/model_selection/_incremental.py in fit(self, X, y, **fit_params)
    572             Additional partial fit keyword arguments for the estimator.
    573         """
--> 574         return default_client().sync(self._fit, X, y, **fit_params)
    575 
    576     @if_delegate_has_method(delegate=("best_estimator_", "estimator"))

~/env/lib/python3.5/site-packages/distributed/client.py in sync(self, func, *args, **kwargs)
    671             return future
    672         else:
--> 673             return sync(self.loop, func, *args, **kwargs)
    674 
    675     def __repr__(self):

~/env/lib/python3.5/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    275             e.wait(10)
    276     if error[0]:
--> 277         six.reraise(*error[0])
    278     else:
    279         return result[0]

~/env/lib/python3.5/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

~/env/lib/python3.5/site-packages/distributed/utils.py in f()
    260             if timeout is not None:
    261                 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 262             result[0] = yield future
    263         except Exception as exc:
    264             error[0] = sys.exc_info()

~/env/lib/python3.5/site-packages/tornado/gen.py in run(self)
   1131 
   1132                     try:
-> 1133                         value = future.result()
   1134                     except Exception:
   1135                         self.had_exception = True

/usr/lib/python3.5/asyncio/futures.py in result(self)
    291             self._tb_logger = None
    292         if self._exception is not None:
--> 293             raise self._exception
    294         return self._result
    295 

~/env/lib/python3.5/site-packages/tornado/gen.py in wrapper(*args, **kwargs)
    324                 try:
    325                     orig_stack_contexts = stack_context._state.contexts
--> 326                     yielded = next(result)
    327                     if stack_context._state.contexts is not orig_stack_contexts:
    328                         yielded = _create_future()

~/env/lib/python3.5/site-packages/dask_ml/model_selection/_incremental.py in _fit(self, X, y, **fit_params)
    522     @gen.coroutine
    523     def _fit(self, X, y, **fit_params):
--> 524         X, y = self._check_array(X, y)
    525 
    526         X_train, X_test, y_train, y_test = self._get_train_test_split(X, y)

~/env/lib/python3.5/site-packages/dask_ml/model_selection/_incremental.py in _check_array(self, X, y, **kwargs)
    437         if isinstance(y, np.ndarray):
    438             y = da.from_array(y, y.shape)
--> 439         X = check_array(X, **kwargs)
    440         kwargs["ensure_2d"] = False
    441         y = check_array(y, **kwargs)

~/env/lib/python3.5/site-packages/dask_ml/utils.py in check_array(array, *args, **kwargs)
    149     elif isinstance(array, dd.DataFrame):
    150         if not accept_dask_dataframe:
--> 151             raise TypeError("This estimator does not support dask dataframes.")
    152         # TODO: sample?
    153         return array

TypeError: This estimator does not support dask dataframes.

Thank you

Michael

To make sure that the test set is also encoded in the same way as the train set we ideally fit the label encoder on test and train data combined and then transform the data in test and train separately. Hope this helps

Top Results From Across the Web

Classifier.fit(X,y) error - Stack Overflow

I'm trying some machine learning algorithms. I'm using sklearn tool for logistic regression script. this is my script: import numpy as np from ......

Estimating The Error In Fit Parameters - YouTube

This is a very quick introduction to finding the error in fit parameters as well as the propagation of error in measurement to...

15.2.2 The Linear Fit with X Error Dialog (Pro Only) - OriginLab

The X values of the fitted curve are plotted using the scale type of the source graph. This option is available only when...

R Error in lm.fit(x, y, offset, singular.ok, ...) : NA/NaN/Inf in 'x' (2 ...

Example 2: Wrong Target Variable in Linear Regression Model. Another reason why the error message “Error in lm.fit(x, y, offset = offset, singular.ok...

Curve Fitting With Python - MachineLearningMastery.com

Curve fitting is a type of optimization that finds an optimal set of ... the parameters to the function that result in the...