TypeError: ufunc 'isnan' not supported for the input types

Code to reproduce the error

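import pandas as pd
from supervised.automl import AutoML  # import path assumed; not shown in the original report

df = pd.read_csv('/home/shahul/Downloads/train.csv.zip').sample(10000)
y = df['target']
X = df.drop(['target'], axis=1)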

a = AutoML(total_time=30,tuning_mode="Normal")

a.fit(X, y)

The error occurs because np.isnan() is applied to an object-dtype array. This happens inside np.nanmedian(), which is called in

/mljar-supervised/supervised/preprocessing/preprocessing_utils.py

Here I used the BNP Paribas dataset as the training data.
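
The root cause can be shown without the dataset. The following is a minimal sketch, not the code from preprocessing_utils.py: the array `values` is made up for illustration. It shows that np.isnan() rejects an object-dtype array, while pd.isnull() accepts it and returns a boolean mask.

```python
import numpy as np
import pandas as pd

# Made-up object-dtype array mixing a float, a missing value and a string.
values = np.array([1.0, None, "x"], dtype=object)

# np.isnan has no implementation for object dtype, so it raises TypeError.
try:
    np.isnan(values)
    isnan_failed = False
except TypeError as exc:
    isnan_failed = True
    print("np.isnan failed:", exc)

# pd.isnull works element-wise on object arrays and flags None/NaN entries.
mask = pd.isnull(values)
print(mask)
```

Whether np.nanmedian() reaches np.isnan() for object input depends on the NumPy version, so the sketch calls np.isnan() directly.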

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

1 reaction
pplonski commented, Aug 29, 2020

@abtheo thank you for the explanations! I see the problem now.

0 reactions
abtheo commented, Aug 29, 2020

Practically, to handle objects of unknown content, I see three main use cases we need to cover:

# Case 1: to be encoded as CATEGORICAL
strs_as_object = np.array(["A", "B", "C"], dtype=object)

# Case 2: to be encoded as DISCRETE (or equivalently, CONTINUOUS for floats)
nums_as_object = np.array([1, 2, "3"], dtype=object)

# Case 3: to be encoded as CATEGORICAL
mixed_input_object = np.array([1, "B", 3], dtype=object)

Currently, Cases 1 and 2 work as expected, but we are not handling Case 3. The ambiguous object dtype causes NumPy errors in several places.

As one example, using Case 3 as the input to AutoML.fit(y=mixed_input_object) causes the following error at line 44 of preprocessing_utils.py:

unique_cnt = len(np.unique(x[~pd.isnull(x)]))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: '<' not supported between instances of 'str' and 'int'
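
The failure comes from np.unique() sorting its input, which requires comparing a str with an int. The following is a minimal sketch of that behaviour and of one possible workaround (casting to str before counting); the variable names `mixed`, `as_text` and `unique_failed` are illustrative and do not come from mljar-supervised.

```python
import numpy as np

# Case 3 from above: integers and a string in one object-dtype array.
mixed = np.array([1, "B", 3], dtype=object)

# np.unique sorts its input; the issue reports a TypeError here because
# str and int values cannot be ordered against each other.
try:
    np.unique(mixed)
    unique_failed = False
except TypeError as exc:
    unique_failed = True
    print("np.unique failed:", exc)

# Casting every element to str makes the values comparable again.
as_text = mixed.astype(str)
unique_cnt = len(np.unique(as_text))
print(unique_cnt)
```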

As another example, using Case 3 as the input to AutoML.fit(x=mixed_input_object) causes problems with to_parquet(), as seen here: https://github.com/pandas-dev/pandas/issues/21228

To solve both of these issues, the very first thing we should do is validate the type of the data. Here is my proposed solution:

y_train_type = PreprocessingUtils.get_type(y_train)

if y_train_type == PreprocessingUtils.DISCRETE or y_train_type == PreprocessingUtils.CONTINUOUS:
    y_train = pd.to_numeric(y_train, errors='coerce')

if y_train_type == PreprocessingUtils.CATEGORICAL:
    y_train = pd.Series([str(y) for y in y_train], name="target")
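
The proposal above depends on PreprocessingUtils.get_type(), which is not shown here. As a self-contained sketch of the same idea, the hypothetical helper `normalize_target` below (not part of mljar-supervised) treats a target as numeric only if every non-missing value parses as a number, and otherwise casts the values to str. This is a simplification of the type detection, offered under that assumption.

```python
import numpy as np
import pandas as pd


def normalize_target(y):
    """Hypothetical helper: return a numeric Series if all present values
    parse as numbers, otherwise a Series of strings."""
    s = pd.Series(y, dtype=object, name="target")
    numeric = pd.to_numeric(s, errors="coerce")
    # Values that were present but could not be parsed as numbers.
    unparsable = numeric.isna() & s.notna()
    if not unparsable.any():
        return numeric
    # Keep missing values as they are; cast everything else to str.
    return s.map(lambda v: v if pd.isna(v) else str(v))


nums = normalize_target(np.array([1, 2, "3"], dtype=object))   # Case 2
mixed = normalize_target(np.array([1, "B", 3], dtype=object))  # Case 3
print(nums.tolist(), mixed.tolist())
```

With this approach, Case 2 becomes numeric and Cases 1 and 3 become uniform strings, so later calls such as np.unique() see a single comparable type.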

