Feature Discussion: FeatureUnion with axis choosing
Hi,
I am using FeatureUnion in a transformation Pipeline. FeatureUnion always uses np.hstack to concatenate the results. See here:
https://github.com/scikit-learn/scikit-learn/blob/bac89c2/sklearn/pipeline.py#L829
I would like to discuss the following idea: what about using numpy.concatenate and letting the user choose the axis through an optional parameter?
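To make the idea concrete, here is a minimal sketch using plain NumPy only. The arrays are made up for illustration and stand in for the outputs of two transformers; nothing here is an existing FeatureUnion option.

```python
import numpy as np

# Made-up outputs of two transformers applied to the same 3 samples.
a = np.arange(6).reshape(3, 2)       # shape (3, 2)
b = np.arange(6, 15).reshape(3, 3)   # shape (3, 3)

# Current behaviour for dense results: stack column-wise.
stacked = np.hstack([a, b])          # shape (3, 5)

# For 2-D inputs, np.hstack is the same as concatenating along axis=1.
same = np.concatenate([a, b], axis=1)

# With a user-chosen axis, other layouts become possible. Here axis=0
# appends rows, which requires an equal number of columns.
c = np.arange(4).reshape(2, 2)       # shape (2, 2)
rows = np.concatenate([a, c], axis=0)  # shape (5, 2)
```

So for 2-D data an `axis=1` default would keep today's result, and other values would only matter for users who ask for them.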
What do you think about this? I can do a PR but wanted to discuss this before I start coding.
Thanks,
Philip
Issue Analytics
- State:
- Created 5 years ago
- Comments: 10 (3 by maintainers)
Top GitHub Comments
pipegraph and nimbusml both are more flexible.
But there’s no reason why you can’t use scikit-learn’s pipeline for this with your own transformer. Looks like you implemented it already. The pipeline is completely agnostic in terms of dimensionality of the data. The issue is more with cross-validation. If you think it’s more widely useful you can publish your implementation yourself, like many other estimators in the scikit-learn-contrib org.
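As a rough sketch of what such a user-defined transformer might look like: the class name `AxisUnion` and its `transformers` and `axis` parameters are hypothetical and not part of scikit-learn. It assumes every sub-transformer returns a dense array whose shapes are compatible along the chosen axis.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer


class AxisUnion(TransformerMixin, BaseEstimator):
    """Hypothetical union: fit sub-transformers, concatenate along `axis`."""

    def __init__(self, transformers, axis=1):
        self.transformers = transformers  # list of (name, transformer) pairs
        self.axis = axis

    def fit(self, X, y=None):
        # Fit a fresh clone of each sub-transformer on the same input.
        self.fitted_ = [clone(t).fit(X, y) for _, t in self.transformers]
        return self

    def transform(self, X):
        outputs = [t.transform(X) for t in self.fitted_]
        # Dense outputs only; the axis is chosen by the user.
        return np.concatenate(outputs, axis=self.axis)


# Example with 3-D input, concatenating along the last axis.
X = np.ones((4, 3, 2))
union = AxisUnion(
    [("log", FunctionTransformer(np.log1p)),
     ("sqrt", FunctionTransformer(np.sqrt))],
    axis=2,
)
out = Pipeline([("union", union)]).fit_transform(X)  # shape (4, 3, 4)
```

The Pipeline passes the 3-D array through unchanged here because FunctionTransformer does not validate its input by default. Cross-validation utilities may still impose their own constraints on the data.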
Well - closing it…