[BUG] GroupedTransformer can't deal with transformers that use `y`


Hi! First of all, thanks for your package, it’s awesome (the same goes for calmcode and the other resources you built).

I’m trying to apply a LeaveOneOutEncoder transformation to each group of my dataset, and I think it’s not possible (or I can’t work out how). The problem is that within sklego/meta/grouped_transformer.py, fit is called without y, which raises an error: TypeError: fit() missing 1 required positional argument: 'y'.

Here is a dummy example that reproduces what I’m trying to do in my code:

import pandas as pd
import numpy as np

from sklego.datasets import load_heroes
from sklego.meta import GroupedTransformer
from category_encoders import LeaveOneOutEncoder


df_heroes = load_heroes(as_frame=True).replace([np.inf, -np.inf], np.nan).dropna()

# Dummy example
X = df_heroes[['attack_type', 'health', 'attack_spd']]
y = df_heroes['attack']

GroupedTransformer(LeaveOneOutEncoder(), groups=['attack_type']).fit_transform(X, y)

Thanks in advance!

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

2 reactions
MBrouns commented, Apr 21, 2021

It looks like the problem appears here: https://github.com/koaning/scikit-lego/blob/main/sklego/meta/grouped_transformer.py#L90

As far as I can tell, we pass y to all the estimators fitted on the subgroups, but the global fallback model is fitted without it. The bit you linked, @koaning, is in the transform step, and although the LeaveOneOutEncoder does use y during transform as well, I would expect a different error from there.
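To make the distinction concrete, here is a minimal sketch of a grouped fit that forwards y to both the per-group transformers and the global fallback. It is not scikit-lego's implementation: `GroupMeanEncoder` and `fit_grouped` are hypothetical names, and the toy encoder only stands in for any transformer whose `fit` requires `y`.

```python
import pandas as pd


class GroupMeanEncoder:
    """Toy stand-in for a target encoder (hypothetical): fit() requires y."""

    def fit(self, X, y):
        self.mean_ = float(y.mean())
        return self

    def transform(self, X):
        return pd.Series(self.mean_, index=X.index, name="encoded")


def fit_grouped(X, y, group_col, make_transformer):
    """Fit a global fallback plus one transformer per group, always forwarding y."""
    # The fallback sees all rows; omitting y here would raise the TypeError from the issue.
    fallback = make_transformer().fit(X.drop(columns=group_col), y)
    per_group = {}
    for key, idx in X.groupby(group_col).groups.items():
        # Slice y with the same row labels as X so both stay aligned.
        per_group[key] = make_transformer().fit(
            X.loc[idx].drop(columns=group_col), y.loc[idx]
        )
    return fallback, per_group


X = pd.DataFrame(
    {
        "attack_type": ["melee", "melee", "ranged", "ranged"],
        "health": [100.0, 120.0, 80.0, 90.0],
    }
)
y = pd.Series([10.0, 20.0, 30.0, 50.0])

fallback, per_group = fit_grouped(X, y, "attack_type", GroupMeanEncoder)
print(fallback.mean_)             # 27.5
print(per_group["melee"].mean_)   # 15.0
print(per_group["ranged"].mean_)  # 40.0
```

The key point is only that y is sliced by the same index as X for each group and is also handed to the fallback fit.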

I’m fine with making the changes to both parts, but I would like to test whether passing y along will actually work for the transform step. I remember from building the TrainOnlyTransformerMixin that it was not very clear when y would get passed along to the transform step when running from a pipeline and when it wouldn’t.

1 reaction
sergiocalde94 commented, Jun 10, 2021
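One way to check this is to record which calls receive y. The sketch below assumes a plain scikit-learn `Pipeline` with default settings (no metadata routing) and uses a hypothetical `Recorder` transformer; it is an experiment to run, not a statement about how GroupedTransformer behaves.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

calls = []  # entries are (method name, whether y was provided)


class Recorder(BaseEstimator, TransformerMixin):
    """Hypothetical pass-through transformer that logs whether y reached it."""

    def fit(self, X, y=None):
        calls.append(("fit", y is not None))
        return self

    def transform(self, X, y=None):
        calls.append(("transform", y is not None))
        return X


X = np.arange(8, dtype=float).reshape(4, 2)
y = np.array([1.0, 2.0, 3.0, 4.0])

pipe = Pipeline([("rec", Recorder()), ("model", LinearRegression())])
pipe.fit(X, y)   # fit receives y; the transform that follows is called with X only
pipe.predict(X)  # transform is called again, still without y
print(calls)
```

With these assumptions, y should reach `fit` but none of the `transform` calls, which would explain why a transformer that needs y at transform time is awkward to use inside a pipeline.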

Do you need any help with this issue? Is it something missed on my part?
