Unable to reproduce OneHotEncoder example from the docs
This example from the API reference returns an error:
from dask_ml.preprocessing import OneHotEncoder
import numpy as np
import dask.array as da
enc = OneHotEncoder()
X = da.from_array(np.array([['A'], ['B'], ['A'], ['C']]), chunks=2)
enc.fit(X)
enc.categories_
enc.transform(X)
This is the traceback:
ValueError                                Traceback (most recent call last)
<ipython-input-1-f54891b18539> in <module>
6 enc.fit(X)
7 enc.categories_
----> 8 enc.transform(X)
~/myenv/lib/python3.7/site-packages/dask_ml/preprocessing/_encoders.py in transform(self, X)
211 self, X: Union[ArrayLike, DataFrameType]
212 ) -> Union[ArrayLike, DataFrameType]:
--> 213 return self._transform(X)
214
215 def _transform_new(
~/myenv/lib/python3.7/site-packages/dask_ml/preprocessing/_encoders.py in _transform(self, X, handle_unknown)
243 for i in range(n_features)
244 ]
--> 245 X = da.concatenate(Xs, axis=1)
246
247 if not self.sparse:
~/myenv/lib/python3.7/site-packages/dask/array/core.py in concatenate(seq, axis, allow_unknown_chunksizes)
3480 raise ValueError("Need array(s) to concatenate")
3481
-> 3482 meta = np.concatenate([meta_from_array(s) for s in seq], axis=axis)
3483
3484 # Promote types to match meta
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: zero-dimensional arrays cannot be concatenated
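The final frame is a plain NumPy error, so the message can be reproduced without dask-ml. The snippet below is only a minimal sketch: the two scalar arrays are stand-ins I made up for the metadata objects passed to np.concatenate in the last frame, and the report does not show what those objects actually were.

```python
import numpy as np

# Hypothetical stand-ins: two zero-dimensional (scalar) arrays. The real
# metadata arrays from the traceback are not shown in the report.
metas = [np.array(0.0), np.array(1.0)]

try:
    # Same call shape as the last frame: concatenate along axis=1.
    np.concatenate(metas, axis=1)
except ValueError as exc:
    message = str(exc)
    print(message)
```

This only shows where the message comes from. It does not explain why the arrays end up zero-dimensional in the first place.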
Environment:
- Dask version: 2.19.0
- numpy version: 1.18.5
- Python version: 3.7.6
- Install method (conda, pip, source): conda
Top GitHub Comments
https://github.com/dask/dask/pull/6339 was just merged, so going to go ahead and close.
For now, you can install Dask from GitHub, or we’ll have a release in a week or two.
Just FYI @SultanOrazbayev, this may be being fixed upstream in https://github.com/dask/dask/pull/6339.
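To check whether an environment is still on the Dask release from the report, something like the following could be used. It is a rough sketch and rests on an assumption: that the first release after 2.19.0 contains the merged fix, which the comments imply but do not state. The helper names release_tuple and is_newer_than_reported are made up for illustration.

```python
import re

REPORTED_VERSION = "2.19.0"  # Dask version listed in the report


def release_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_newer_than_reported(version):
    # Assumption: any release after 2.19.0 includes the upstream fix.
    # A development install from GitHub may still report a 2.19.0-based
    # version string, in which case this returns False.
    return release_tuple(version) > release_tuple(REPORTED_VERSION)


try:
    import dask
except ImportError:
    print("dask is not installed")
else:
    print(dask.__version__, "newer than reported:",
          is_newer_than_reported(dask.__version__))
```

If this prints False on a release install, upgrading Dask and rerunning the example from the top of the issue would be the next thing to try.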