How to handle a pipeline for a hierarchical setup?
Hi,
I’m currently dealing with the following problem. I have a hierarchy of labels of the form:
labels = {
    "parent1": {"child1", "child2"},
    "parent2": {"child1", "child2", "child3"},
}
My pipeline consists of a first model that classifies the parent labels, followed by one classifier per parent for that parent’s child labels. I will put it as simply as I can:
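import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

class ParentLabelsModel(BaseEstimator, TransformerMixin):
    def __init__(self, base_estimator):
        TransformerMixin.__init__(self)
        BaseEstimator.__init__(self)
        self.base_estimator = base_estimator()

    def fit(self, X, y, *args):
        self.base_estimator.fit(X, y)
        return self

    def fit_transform(self, X, y, *args):
        self.base_estimator.fit(X, y)
        return X

    def transform(self, X, *args):
        # Returns the features together with the predicted parent labels.
        y = self.base_estimator.predict(X)
        return X, y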

class ChildLabelsModel(BaseEstimator, ClassifierMixin):
    def __init__(self, base_estimator, unique_parent_labels):
        BaseEstimator.__init__(self)
        self.unique_parent_labels = unique_parent_labels
        self.child_estimators = {p: base_estimator()
                                 for p in unique_parent_labels}

    def fit(self, X, y, y_, *args):
        X = X.toarray()
        for p in self.unique_parent_labels:
            child_indices = np.argwhere(np.array(y) == p).flatten()
            x_ = X[child_indices, :]
            y__ = y_[child_indices]
            self.child_estimators[p].fit(x_, y__)
        return self

    def predict(self, X, *args):
        parent_y = X[1][0]
        X = X[0].toarray()
        return (np.array([parent_y]),
                np.array([self.child_estimators[parent_y].predict(X)]))

pipeline = Pipeline([
    ("tfidf1", TfidfVectorizer()),
    ("plm", ParentLabelsModel(base_estimator=SGDClassifier)),
    ("clm", ChildLabelsModel(
        base_estimator=SGDClassifier,
        unique_parent_labels=np.unique(ytrain))
    )
])
As you can see, the first model, ParentLabelsModel, is a transformer, so that I can pass the predicted parent labels on and use them to select the specific entry of child_estimators (there is one classifier per parent label).
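For comparison, the same parent-then-child routing can also be sketched as a single estimator, which avoids returning a tuple from transform. This is only a sketch under assumptions: HierarchicalClassifier is a hypothetical name, the targets are assumed to be passed as a two-column array (parent, child), and the toy documents are made up for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline


class HierarchicalClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: one parent classifier plus one child classifier per parent."""

    def __init__(self, base_estimator):
        self.base_estimator = base_estimator

    def fit(self, X, y):
        # Assumption: y has shape (n_samples, 2), column 0 = parent, column 1 = child.
        y = np.asarray(y)
        parents, children = y[:, 0], y[:, 1]
        self.parent_estimator_ = clone(self.base_estimator).fit(X, parents)
        self.child_estimators_ = {}
        for p in np.unique(parents):
            rows = np.flatnonzero(parents == p)
            # Each child classifier only sees the rows of its own parent.
            self.child_estimators_[p] = clone(self.base_estimator).fit(
                X[rows], children[rows])
        return self

    def predict(self, X):
        parents = self.parent_estimator_.predict(X)
        children = np.empty(len(parents), dtype=object)
        for p in np.unique(parents):
            rows = np.flatnonzero(parents == p)
            # Route each row to the child classifier of its predicted parent.
            children[rows] = self.child_estimators_[p].predict(X[rows])
        return np.column_stack([parents, children])


# Made-up toy data following the label hierarchy from the question.
labels = {
    "parent1": {"child1", "child2"},
    "parent2": {"child1", "child2", "child3"},
}
docs = ["alpha one", "alpha one again", "alpha two", "alpha two again",
        "beta one", "beta one again", "beta two", "beta three"]
y = np.array([
    ["parent1", "child1"], ["parent1", "child1"],
    ["parent1", "child2"], ["parent1", "child2"],
    ["parent2", "child1"], ["parent2", "child1"],
    ["parent2", "child2"], ["parent2", "child3"],
])

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("hier", HierarchicalClassifier(SGDClassifier(random_state=0))),
])
pipeline.fit(docs, y)
pred = pipeline.predict(["alpha one", "beta three"])  # shape (2, 2): parent, child
```

With this layout a predicted child always belongs to the predicted parent, because each child classifier is trained only on its own parent’s rows. It does not by itself solve the conversion question, but it leaves a single custom estimator to convert instead of two.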
I’ve been trying to convert both ParentLabelsModel and ChildLabelsModel to ONNX, but without success so far. Before pasting the code I’ve written for the update_registered_converter functions, I would like to check whether you have any opinion or guidelines on how to approach this conversion.
Thanks!
I made a proof of concept of a numpy API for ONNX. It works with decorators, and I also wrote a tutorial about it. Here is a current transformer. Its transform method is implemented with ONNX operators, but the syntax is similar to numpy.
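To illustrate what a numpy-style formulation of the hierarchy could look like, here is a minimal sketch in plain NumPy (not the ONNX numpy API itself). It assumes the parent and child classifiers expose per-class scores, and it replaces the per-row dictionary lookup with array operations: every child classifier is evaluated for every row, and the result for the predicted parent is gathered afterwards. The function name route_children and the score arrays are hypothetical. The operations used (argmax, gather along an axis) roughly correspond to the ONNX operators ArgMax and GatherElements.

```python
import numpy as np


def route_children(parent_scores, child_scores_per_parent):
    """Select each row's child index according to its predicted parent.

    parent_scores: array of shape (n_samples, n_parents).
    child_scores_per_parent: list with one array per parent,
        each of shape (n_samples, n_children_of_that_parent).
    """
    # Predicted parent index for every row.
    parent_idx = np.argmax(parent_scores, axis=1)
    # Best child index under every parent, stacked as (n_parents, n_samples).
    child_idx_all = np.stack(
        [np.argmax(scores, axis=1) for scores in child_scores_per_parent])
    # For each row, keep the child index that belongs to its predicted parent.
    child_idx = np.take_along_axis(child_idx_all, parent_idx[None, :], axis=0)[0]
    return parent_idx, child_idx


# Hypothetical scores for 3 samples, 2 parents with 2 and 3 children.
parent_scores = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])
child_scores_per_parent = [
    np.array([[0.1, 0.9], [0.7, 0.3], [0.5, 0.4]]),
    np.array([[0.2, 0.3, 0.5], [0.6, 0.3, 0.1], [0.1, 0.8, 0.1]]),
]
parent_idx, child_idx = route_children(parent_scores, child_scores_per_parent)
```

The trade-off is that every child classifier runs on every row, which costs more computation but keeps the whole prediction free of data-dependent Python branching.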
Closing the issue, feel free to reopen it.