Allow stacking pandas dataframes in ColumnTransformer?

See original GitHub issue

Right now we make it easy to override _hstack in ColumnTransformer, but I wonder if we should try to be more generic in the first place. I don’t think we can do fully generic stacking without something like NEP 37, but maybe we can already allow stacking pandas dataframes?

I can think of few cases where all the transformers output pandas dataframes but you want to output a numpy array in the ColumnTransformer. This would be a backward incompatible change, so we’d probably need to make it optional (or deprecate the current behavior). Wdyt?

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

jnothman commented, May 4, 2021 (1 reaction)

Sorry, perhaps I mis-parsed your language. You’re saying “You’d rarely want a numpy array from transformers outputting dataframes”.

I think stacking frames into frames is good, at least where the indexes are all identical. We could put it into 1.0 without deprecation if we really want.
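To make the "identical indexes" condition concrete, here is a minimal sketch of horizontal stacking with pandas alone. The helper name `hstack_frames` is hypothetical and is not part of scikit-learn; it only illustrates the check described in the comment, assuming every transformer output is already a DataFrame.

```python
import pandas as pd


def hstack_frames(frames):
    """Column-wise stack of DataFrames that all share one index.

    Hypothetical helper for illustration; not a scikit-learn API.
    """
    first_index = frames[0].index
    for frame in frames[1:]:
        # pd.concat would silently align on the index and insert NaN
        # for missing labels, so mismatches are rejected up front.
        if not frame.index.equals(first_index):
            raise ValueError("all frames must share the same index")
    return pd.concat(frames, axis=1)


left = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 11, 12])
right = pd.DataFrame({"b": [0.1, 0.2, 0.3]}, index=[10, 11, 12])

# Both frames use the index [10, 11, 12], so the result keeps it.
stacked = hstack_frames([left, right])
print(stacked)
```

The explicit check matters because `pd.concat(..., axis=1)` performs an outer join on the index by default, which would change the number of rows when the indexes differ.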

jnothman commented, May 5, 2021 (0 reactions)

I think that’s what we are assuming at the moment.


Top Results From Across the Web

how to use ColumnTransformer() to return a dataframe?
From sklearn version 1.2 on, transformers can return a pandas DataFrame directly without further handling. It is done with set_output ...
sklearn.compose.ColumnTransformer
Applies transformers to columns of an array or pandas DataFrame. This estimator allows different columns or column subsets of the input to be...
How to Use the ColumnTransformer for Data Preparation
The ColumnTransformer is a class in the scikit-learn Python machine learning library that allows you to selectively apply data preparation ...
pandas.DataFrame.stack — pandas 1.5.2 documentation
Stack the prescribed level(s) from columns to index. Return a reshaped DataFrame or Series having a multi-level index with one or more new...
Is there a way to force a transformer to return a pandas ...
I am having issues with scikit-learn converting dataframes to numpy arrays. For instance, the following code from sklearn.impute import ...
