Unable to reproduce OneHotEncoder example from the docs

This example from the API reference returns an error:

from dask_ml.preprocessing import OneHotEncoder
import numpy as np
import dask.array as da
enc = OneHotEncoder()
X = da.from_array(np.array([['A'], ['B'], ['A'], ['C']]), chunks=2)
enc.fit(X)
enc.categories_
enc.transform(X)

This is the traceback:

ValueError                                Traceback (most recent call last)
<ipython-input-1-f54891b18539> in <module>
      6 enc.fit(X)
      7 enc.categories_
----> 8 enc.transform(X)

~/myenv/lib/python3.7/site-packages/dask_ml/preprocessing/_encoders.py in transform(self, X)
    211         self, X: Union[ArrayLike, DataFrameType]
    212     ) -> Union[ArrayLike, DataFrameType]:
--> 213         return self._transform(X)
    214 
    215     def _transform_new(

~/myenv/lib/python3.7/site-packages/dask_ml/preprocessing/_encoders.py in _transform(self, X, handle_unknown)
    243                 for i in range(n_features)
    244             ]
--> 245             X = da.concatenate(Xs, axis=1)
    246 
    247             if not self.sparse:

~/myenv/lib/python3.7/site-packages/dask/array/core.py in concatenate(seq, axis, allow_unknown_chunksizes)
   3480         raise ValueError("Need array(s) to concatenate")
   3481 
-> 3482     meta = np.concatenate([meta_from_array(s) for s in seq], axis=axis)
   3483 
   3484     # Promote types to match meta

<__array_function__ internals> in concatenate(*args, **kwargs)

ValueError: zero-dimensional arrays cannot be concatenated
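The final error message comes from NumPy itself rather than from dask-ml. As a minimal sketch in plain NumPy, independent of Dask, the snippet below shows that np.concatenate refuses zero-dimensional inputs with this same message, while one-dimensional inputs work. The helper name try_concatenate is made up for illustration, and the issue does not state why the meta arrays were zero-dimensional in the failing call, so this only illustrates the last frame of the traceback.

```python
import numpy as np


def try_concatenate(arrays):
    """Return the concatenated array, or the error message if NumPy refuses.

    Hypothetical helper for illustration only; it is not part of dask or dask-ml.
    """
    try:
        return np.concatenate(arrays)
    except ValueError as exc:
        return str(exc)


# Zero-dimensional (scalar-like) arrays have no axis to join along.
zero_dim = [np.array(1.0), np.array(2.0)]
# One-dimensional arrays can be joined along axis 0.
one_dim = [np.array([1.0]), np.array([2.0])]

message = try_concatenate(zero_dim)
joined = try_concatenate(one_dim)

print(message)
print(joined)
```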

Environment:

  • Dask version: 2.19.0
  • numpy version: 1.18.5
  • Python version: 3.7.6
  • Install method (conda, pip, source): conda
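A version list like the one above can be collected with a short script. This is only a sketch, assuming Python 3.8 or newer (for importlib.metadata) and the distribution names dask, dask-ml, and numpy; the function name collect_versions is made up for illustration.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version string.

    Hypothetical helper for illustration; packages that are not
    installed are reported as "not installed" instead of raising.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


report = collect_versions(["dask", "dask-ml", "numpy"])
for name, version in report.items():
    print(f"{name}: {version}")
```

Pasting this output into a bug report makes it easier to tell whether the installed Dask predates an upstream fix.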

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

TomAugspurger commented, Jun 23, 2020

https://github.com/dask/dask/pull/6339 was just merged, so going to go ahead and close.

For now, you can install Dask from GitHub, or we’ll have a release in a week or two.

TomAugspurger commented, Jun 22, 2020

Just FYI @SultanOrazbayev, this may get fixed upstream in https://github.com/dask/dask/pull/6339.
