Add support for pandas 1.4.0


Looks like the root cause may be that converting a nullable type to Categorical now preserves the nullable type in the categories of the pandas Categorical. In the screenshot, the top is the latest pandas and the bottom is pandas 1.3.4.

image

This causes the np.asarray calls in our imputers to introduce the new pandas null-value placeholder, <NA>, which throws off sklearn:

image
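
Since the screenshots did not survive, here is a minimal sketch for reproducing the observation locally. The helper names (`categories_dtype_after_cast`, `contains_pd_na`) are made up for illustration and are not part of evalml or pandas; the output depends on the installed pandas version, so no expected values are shown.

```python
import numpy as np
import pandas as pd


def categories_dtype_after_cast(values, dtype="Int64"):
    """Build a nullable Series, cast it to category, and return the
    dtype of the resulting categories (hypothetical helper)."""
    series = pd.Series(values, dtype=dtype)
    return series.astype("category").cat.categories.dtype


def contains_pd_na(array_like):
    """Return True if any element is the pd.NA singleton
    (hypothetical helper)."""
    arr = np.asarray(array_like, dtype=object)
    return any(value is pd.NA for value in arr.ravel())


if __name__ == "__main__":
    print("pandas version:", pd.__version__)
    print("categories dtype:", categories_dtype_after_cast([1, 2, None]))

    # Mimic what an imputer does before handing data to sklearn.
    categorical = pd.Series([1, 2, None], dtype="Int64").astype("category")
    print("np.asarray contains <NA>:", contains_pd_na(categorical))
```

Running this under pandas 1.3.4 and under 1.4.0 should show whether the categories keep the nullable dtype and whether `<NA>` reaches the NumPy array.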

_Originally posted by @freddyaboulton in https://github.com/alteryx/evalml/issues/3272#issuecomment-1020261986_

This change in behavior breaks tests in three places:

  • In our simple imputer, we convert NaturalLanguage to Categorical in order to support imputing natural language via the most_frequent strategy (test_simple_imputer_supports_natural_language_constant).
  • In our imputer tests, we run the imputer on a dataframe that had a nullable int column converted to categorical (test_imputer_woodwork_custom_overrides_returned_by_components).
  • In our tests for EmailFeaturizer and URLFeaturizer, the categorical features created from the Email and URL logical types end up with nullable types in their categories, because the physical type for Email and URL is text. See test_ft_transform_primitive_components.py::test_component_fit_transform[component1-make_data_email_fit_transform_missing_values-make_answer_email_fit_transform_missing_values-make_expected_logical_types_email_fit_transform_missing_values].
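
One possible way to keep `<NA>` away from sklearn in cases like the three above is to normalize missing values to `np.nan` before converting to a NumPy array. The sketch below is only an illustration of that idea, not the fix adopted in evalml; `to_sklearn_friendly` is a hypothetical name.

```python
import numpy as np
import pandas as pd


def to_sklearn_friendly(series):
    """Return an object ndarray in which every missing value
    (pd.NA, None, NaN) is replaced by np.nan (hypothetical helper)."""
    # Cast to object first so the result is a plain, writable ndarray
    # regardless of whether the categories use a nullable dtype.
    arr = series.astype(object).to_numpy(dtype=object, copy=True)
    # pd.isna treats pd.NA, None and NaN uniformly as missing.
    arr[pd.isna(arr)] = np.nan
    return arr


if __name__ == "__main__":
    categorical = pd.Series([1, 2, None], dtype="Int64").astype("category")
    print(to_sklearn_friendly(categorical))
```

The helper returns an object array, so numeric columns may still need an explicit cast (for example to float) before being passed to an estimator.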

Additionally:

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
chukarsten commented, Feb 15, 2022

Looks like the XGBoost warnings are remedied, per https://github.com/dmlc/xgboost/pull/7595! Just need to wait for a release from XGBoost.

1 reaction
thehomebrewnerd commented, Feb 2, 2022

@chukarsten Just a heads up, Featuretools doesn’t yet fully support pandas==1.4.0 either. It might work for you, but no guarantees at this point.

https://github.com/alteryx/featuretools/issues/1865
