ColumnTransformer not dropping string columns after encoding without Pipeline
Description
ColumnTransformer works properly when the transformer is a Pipeline, but not when it is given a list of estimators.
Steps/Code to Reproduce
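import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'a': ['v1', 'v2'],
                   'b': ['v1', 'v2'],
                   'c': [1, 2]})
cols = ['a', 'b']
# Two separate transformers, both assigned to the same columns
imp = ('imp', SimpleImputer(strategy='constant'), cols)
# sparse=False as in 0.20; scikit-learn >= 1.2 names this parameter sparse_output
ohe = ('ohe', OneHotEncoder(sparse=False), cols)
transformers = [imp, ohe]
ct = ColumnTransformer(transformers)
ct.fit_transform(df)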
Expected Results
array([[1., 0., 1., 0.],
       [0., 1., 0., 1.]])
Actual Results
array([['v1', 'v1', 1.0, 0.0, 1.0, 0.0],
       ['v2', 'v2', 0.0, 1.0, 0.0, 1.0]], dtype=object)
Correct results produced with a Pipeline
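from sklearn.pipeline import Pipeline

# Pipeline steps are (name, estimator) pairs; the columns are given once, to the ColumnTransformer
imp2 = ('imp', SimpleImputer(strategy='constant'))
ohe2 = ('ohe', OneHotEncoder(sparse=False))
steps = [imp2, ohe2]
pipe = Pipeline(steps)
transformers2 = [('cat', pipe, cols)]
ct = ColumnTransformer(transformers2)
ct.fit_transform(df)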
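The Pipeline version behaves differently because its steps run in sequence, so the encoder only ever sees the imputer's output. A minimal, self-contained sketch of that check follows. It is not part of the original report: it assumes a recent scikit-learn, leaves the encoder's sparse setting at its default, and uses `sparse_threshold=0` on the ColumnTransformer to get a dense array instead.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'a': ['v1', 'v2'],
                   'b': ['v1', 'v2'],
                   'c': [1, 2]})
cols = ['a', 'b']

# Steps run one after another: impute first, then encode the imputed values.
pipe = Pipeline([('imp', SimpleImputer(strategy='constant')),
                 ('ohe', OneHotEncoder())])

# sparse_threshold=0 asks ColumnTransformer to always return a dense array
# (an assumption made here to avoid version-specific encoder arguments).
ct = ColumnTransformer([('cat', pipe, cols)], sparse_threshold=0)
result = ct.fit_transform(df)

# The array listed under "Expected Results" in the report.
expected = np.array([[1., 0., 1., 0.],
                     [0., 1., 0., 1.]])
print(np.array_equal(result, expected))
```

Column 'c' does not appear in the result because it is not listed in any transformer and the default `remainder='drop'` discards it.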
Versions
Darwin-17.7.0-x86_64-i386-64bit
Python 3.6.4 |Anaconda custom (64-bit)| (default, Mar 12 2018, 20:05:31) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
NumPy 1.14.3
SciPy 1.1.0
Scikit-Learn 0.20rc1
Issue Analytics
- State:
- Created 5 years ago
- Comments: 6 (5 by maintainers)
Top GitHub Comments
“the features generated by each transformer will be concatenated to form the output feature matrix”
I’m working on this.
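The documentation sentence quoted in the first comment can be illustrated with a short sketch. It is an illustration under stated assumptions rather than the library's internal code: each transformer is fitted on the original columns on its own, and the resulting blocks are placed side by side, which is why the string columns from the imputer show up next to the one-hot columns. The use of `sparse_threshold=0` and `.toarray()` is an assumption made here to keep the outputs dense on recent scikit-learn versions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'a': ['v1', 'v2'],
                   'b': ['v1', 'v2'],
                   'c': [1, 2]})
cols = ['a', 'b']

# Apply each transformer to the original columns, independently of the other.
imputed = SimpleImputer(strategy='constant').fit_transform(df[cols])
encoded = OneHotEncoder().fit_transform(df[cols]).toarray()

# Concatenate the two blocks column-wise by hand.
manual = np.hstack([imputed, encoded])

# The same two transformers, given as a list to ColumnTransformer.
ct = ColumnTransformer(
    [('imp', SimpleImputer(strategy='constant'), cols),
     ('ohe', OneHotEncoder(), cols)],
    sparse_threshold=0,  # always return a dense array
)
combined = ct.fit_transform(df)

print(combined.shape)
print(combined.tolist() == manual.tolist())
```

To chain the imputer into the encoder instead, the two estimators need to be wrapped in a single Pipeline and passed as one transformer, as in the report's second snippet.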