BUG: cannot safely cast non-equivalent float64 to Int64

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(3,4), columns= list('ABCD'))
df_int = pd.to_numeric(df['A'], errors='coerce').astype('Int64')

Problem description

The cast fails with "cannot safely cast non-equivalent float64 to Int64". It should behave like the conversion from float64 to int64, which rounds the number down.
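
To make the contrast concrete, here is a minimal sketch of the two casts side by side. The sample values are made up for illustration (chosen to be exactly representable as floats), and the behavior shown is what recent pandas releases do; older versions may differ in the exact message.

```python
import pandas as pd

# Illustrative values only; none of them is a whole number.
s = pd.Series([0.25, 1.75, 2.5])

# NumPy's int64 silently drops the fractional part.
as_numpy_int = s.astype("int64")
print(as_numpy_int.tolist())

# The nullable Int64 extension dtype refuses a cast that would change values.
raised = False
try:
    s.astype("Int64")
except TypeError as exc:
    raised = True
    print(exc)
```

For positive inputs like these, dropping the fraction and rounding down give the same result; they differ for negative numbers.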

Expected Output

0    0
1    0
2    0
Name: A, dtype: Int64

Output of pd.show_versions()

pandas : 1.0.1

You can convert to Int64 safely by flooring the values first:

df_int = np.floor(pd.to_numeric(df['A'], errors='coerce')).astype('Int64')
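
A small sketch of how this workaround behaves when coercion produces missing values. The input strings are invented for illustration, and `np.trunc` is shown only as an alternative for anyone who wants truncation toward zero rather than rounding down; it is not part of the original suggestion.

```python
import numpy as np
import pandas as pd

# Hypothetical raw column: one unparseable entry and one negative number.
raw = pd.Series(["0.7", "-1.2", "oops", "3.9"])

# "oops" cannot be parsed, so errors="coerce" turns it into NaN.
numeric = pd.to_numeric(raw, errors="coerce")

# After flooring, every non-missing value is whole, so the cast is accepted.
floored = np.floor(numeric).astype("Int64")

# np.trunc rounds toward zero instead, matching what astype("int64") does.
truncated = np.trunc(numeric).astype("Int64")

print(floored)
print(truncated)
```

The NaN survives as a missing value in the Int64 result, so there is no need to fill it beforehand.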

TypeError                                 Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\arrays\integer.py in safe_cast(values, dtype, copy)
    143     try:
--> 144         return values.astype(dtype, casting="safe", copy=copy)
    145     except TypeError:

TypeError: Cannot cast array from dtype('float64') to dtype('int64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-14-8151dc1b5846> in <module>
----> 1 df_int = pd.to_numeric(df['A'], errors='coerce').astype('Int64')
      2 df_int

~\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
   5696         else:
   5697             # else, only a single dtype is given
-> 5698             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors)
   5699             return self._constructor(new_data).__finalize__(self)
   5700

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
    580
    581     def astype(self, dtype, copy: bool = False, errors: str = "raise"):
--> 582         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    583
    584     def convert(self, **kwargs):

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, filter, **kwargs)
    440                 applied = b.apply(f, **kwargs)
    441             else:
--> 442                 applied = getattr(b, f)(**kwargs)
    443             result_blocks = _extend_blocks(applied, result_blocks)
    444

~\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
    623             vals1d = values.ravel()
    624             try:
--> 625                 values = astype_nansafe(vals1d, dtype, copy=True)
    626             except (ValueError, TypeError):
    627                 # e.g. astype_nansafe can fail on object-dtype of strings

~\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
    819     # dispatch on extension dtype if needed
    820     if is_extension_array_dtype(dtype):
--> 821         return dtype.construct_array_type()._from_sequence(arr, dtype=dtype, copy=copy)
    822
    823     if not isinstance(dtype, np.dtype):

~\anaconda3\lib\site-packages\pandas\core\arrays\integer.py in _from_sequence(cls, scalars, dtype, copy)
    348     @classmethod
    349     def _from_sequence(cls, scalars, dtype=None, copy=False):
--> 350         return integer_array(scalars, dtype=dtype, copy=copy)
    351
    352     @classmethod

~\anaconda3\lib\site-packages\pandas\core\arrays\integer.py in integer_array(values, dtype, copy)
    129     TypeError if incompatible types
    130     """
--> 131     values, mask = coerce_to_array(values, dtype=dtype, copy=copy)
    132     return IntegerArray(values, mask)
    133

~\anaconda3\lib\site-packages\pandas\core\arrays\integer.py in coerce_to_array(values, dtype, mask, copy)
    245         values = safe_cast(values, dtype, copy=False)
    246     else:
--> 247         values = safe_cast(values, dtype, copy=False)
    248
    249     return values, mask

~\anaconda3\lib\site-packages\pandas\core\arrays\integer.py in safe_cast(values, dtype, copy)
    150
    151     raise TypeError(
--> 152         f"cannot safely cast non-equivalent {values.dtype} to {np.dtype(dtype)}"
    153     )
    154

TypeError: cannot safely cast non-equivalent float64 to int64
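
The traceback ends in a "safe cast" step that rejects values that would change under conversion. The sketch below illustrates that idea of an equivalence check in plain NumPy. `equivalent_int_cast` is a hypothetical helper written for this explanation, not pandas' own code, and it assumes the input contains no NaN (pandas tracks missing values in a separate mask).

```python
import numpy as np


def equivalent_int_cast(values: np.ndarray) -> np.ndarray:
    """Hypothetical helper: cast floats to int64 only if no value changes."""
    casted = values.astype("int64")
    # If any element differs after the round trip, the cast would lose data.
    if not (casted == values).all():
        raise TypeError(
            f"cannot safely cast non-equivalent {values.dtype} to int64"
        )
    return casted


whole = equivalent_int_cast(np.array([1.0, 2.0, 3.0]))
print(whole)

rejected = False
try:
    equivalent_int_cast(np.array([1.5, 2.0]))
except TypeError as exc:
    rejected = True
    print(exc)
```

Under this reading, the error is about lost information: whole-number floats pass, and anything with a fractional part is refused.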

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 11 (2 by maintainers)

Top GitHub Comments

7 reactions
efagerberg commented, Aug 13, 2021

@efagerberg NaN is a float value, so the column gets cast to float64.

The problem is that Int64 has its own missing-value equivalent as well, and in some cases, such as when no other value in the column has a decimal part, it is inappropriate to assume float.

If this is the expected result, then fine, it is not a bug. However, it seems at minimum unexpected for the situation I describe above, and it should be better.
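
The situation described in this comment can be sketched as follows. The data is made up, and the behavior shown is that of recent pandas releases: whole numbers plus one missing entry are inferred as float64, yet they convert to Int64 without error because every non-missing value is whole. `convert_dtypes` is included as one existing way to let pandas pick the nullable dtype itself.

```python
import pandas as pd

# Whole numbers plus one missing entry: pandas infers float64.
s = pd.Series([1, 2, None])
print(s.dtype)

# Every non-missing value is whole, so the cast to Int64 is accepted
# and the missing entry becomes <NA>.
nullable = s.astype("Int64")
print(nullable)

# convert_dtypes() chooses a nullable dtype when the values allow it.
converted = s.convert_dtypes()
print(converted.dtype)
```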

4 reactions
efagerberg commented, May 24, 2022

Agreed that the current error, TypeError: cannot safely cast non-equivalent float64 to int64, is the expected behavior. The workaround you have is the suitable "workaround" for exactly what you want. Closing.

I think a more honest answer from the maintainers would be to accept this as an unexpected result that makes working with pandas harder.

There can be a legitimate rationale for not working on it for the past 2 years. The way this issue was closed feels more like the maintainers would rather sweep it under the rug, since the comment preceding the closing of this issue did not really address any of the criticism in earlier comments.

Those critiques being:

  1. It is odd for a user who has a column of mostly whole numbers and one empty cell to get back a column of type float. There can be legitimate reasons why (NA is a float), but it is odd regardless.
  2. Existing utilities that are supposed to facilitate the conversion from float to Int64 raise errors instead, for example the dtype keyword argument of read_csv. Int64 is a nullable integer type and thus should be convertible from float if the floats have no decimal values. To me, this is the clearest sign that this is in fact a bug and not something more suitable for a feature request.

Now, to be fair to the maintainers, the original issue creator's problem seems to be that they are trying to cast float values that are not whole numbers to Int64. In that case the error message makes sense. I think it would be reasonable to also suggest that this be filed as a new bug issue for clarity.

