Augment InvalidTargetDataCheck to check for mismatched indices in target and training data


This fell out of https://github.com/alteryx/evalml/issues/1723. My guess is that this would fit best in InvalidTargetDataCheck, but I'm open to suggestions.

We’ve seen a few cases where indices cause bugs (imputer, encoder, etc.). It could be a good idea to warn users if their data has mismatched indices, since that could lead to unexpected behavior. For example, pandas could try to backfill some indices, introduce NaN values, and cause errors during modelling.
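To make the concern concrete, here is a minimal sketch of how mismatched indices can introduce NaN through pandas label alignment, plus one way such a mismatch could be detected. The helper `find_index_mismatch` and the keys it returns are hypothetical names for illustration only; this is not EvalML's actual InvalidTargetDataCheck implementation.

```python
import pandas as pd

# Features and target have the same length but different index labels.
X = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=[0, 1, 2])
y = pd.Series([10, 20, 30], index=[1, 2, 3])

# pandas aligns on index labels, not on position, so the row labelled 0
# has no matching target value and receives NaN.
combined = X.assign(target=y)
n_missing = int(combined["target"].isna().sum())


def find_index_mismatch(X, y):
    """Hypothetical helper: describe how the indices of X and y differ.

    Returns None when both indices hold the same labels in the same order.
    """
    if X.index.equals(y.index):
        return None
    only_in_features = X.index.difference(y.index).tolist()
    only_in_target = y.index.difference(X.index).tolist()
    return {
        "only_in_features": only_in_features,
        "only_in_target": only_in_target,
        # No label is exclusive to either side, yet the indices are not
        # equal: the labels match but appear in a different order.
        "order_differs": not only_in_features and not only_in_target,
    }


mismatch = find_index_mismatch(X, y)
if mismatch is not None:
    print(f"Index mismatch between features and target: {mismatch}")
```

A data check could surface the returned details as a warning rather than an error, since a mismatch may be intentional; that choice is an assumption here, not something the issue specifies.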

@freddyaboulton FYI 😀

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
freddyaboulton commented, Feb 10, 2021

@angela97lin Yea, this isn’t blocked by #1723!

1 reaction
angela97lin commented, Feb 9, 2021

@freddyaboulton Correct me if I’m wrong, but I don’t think this is blocked on #1723, which I noticed you picked up? If that’s true, I can pick this up and work on it concurrently 😃
