BUG: Replacing NaN with None in Pandas 1.3 does not work

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandas.
  • (optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

Pandas 1.2

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame([0.5, np.nan])
>>> df.where(pd.notnull(df), None)
     0
0  0.5
1  None

Pandas 1.3

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame([0.5, np.nan])
>>> df.where(pd.notnull(df), None)
     0
0  0.5
1  NaN

Problem description

Replacing NaN values with None (or any other Python object) should work as in previous Pandas versions.

Expected Output

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame([0.5, np.nan])
>>> df.where(pd.notnull(df), None)
     0
0  0.5
1  None

Output of pd.show_versions()

INSTALLED VERSIONS

commit : f00ed8f47020034e752baf0250483053340971b0
python : 3.8.10.final.0
python-bits : 64
OS : Darwin
OS-release : 21.0.0
Version : Darwin Kernel Version 21.0.0: Sun Jun 20 18:43:49 PDT 2021; root:xnu-8011.0.0.121.4~2/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : es_ES.UTF-8
LOCALE : es_ES.UTF-8

pandas : 1.3.0
numpy : 1.18.5
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.3
setuptools : 57.1.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.3.7
lxml.etree : 4.6.2
html5lib : None
pymysql : None
psycopg2 : 2.9.1 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 7.20.0
pandas_datareader : None
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 3.0.0
pyxlsb : None
s3fs : None
scipy : 1.3.3
sqlalchemy : 1.4.20
tables : None
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : None
numba : None

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 6
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

10 reactions
lerome commented, Jul 7, 2021

That was apparently changed on purpose: https://github.com/pandas-dev/pandas/pull/39761. But it’s indeed not very convenient. For example, we often want to replace NaN with None just before converting the dataframe back to a dict (df.to_dict) or another data structure. I’m pretty sure that this change alone breaks many existing code bases.

1 reaction
jbrockmendel commented, Jul 9, 2021

"One can still convert to None, however you need to convert to object dtype first"

This is the right way to do this.

"I was asking for a workaround without converting to object dtype, since like you said it involves performance degradation"

The 1.2 behavior referenced by the OP also converts to object, so if you want None in your Series, there is no way to do this without object dtype.
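
The workaround described in this comment, casting to object dtype before calling where, can be sketched as follows. This is a minimal sketch that assumes pandas 1.3 or later; the column name "value" and the variable names are illustrative and do not come from the original issue.

```python
import numpy as np
import pandas as pd

# Illustrative frame; the column name "value" is an assumption for readability.
df = pd.DataFrame({"value": [0.5, np.nan]})

# Cast to object dtype first so the column is able to hold None,
# then replace every missing entry with None.
with_none = df.astype(object).where(df.notna(), None)

# Follow-up step mentioned earlier in the thread: convert to plain Python data.
records = with_none.to_dict("records")

print(with_none.dtypes)
print(records)
```

Because the result is stored as object dtype, numeric operations on it are likely to be slower, so it is probably best applied as the last step before exporting the data.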
