DataFrame.replace() overwrites when values are non-numeric

Code Sample, a copy-pastable example if possible

In [1]: pd.Series([1,2,3]).replace({1 : 2, 2 : 3, 3 : 4})
Out[1]: 
0    2
1    3
2    4
dtype: int64

In [2]: pd.Series(['1','2','3']).replace({'1' : '2', '2' : '3', '3' : '4'})
Out[2]:
0    4
1    4
2    4
dtype: object

Problem description

I’d expect replacement over the values in a DataFrame to be non-transitive. Suppose we want to replace a with b, and b with c. When this mapping is applied to an entry containing the value a, the rules are chained, so c is returned instead of b. The same replacement is not transitive for numeric values, as the example code shows.

I think this default behavior should be mentioned explicitly in the documentation. It would also be nice to have a Boolean option to set the transitivity on/off.
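In the meantime, one possible workaround is to avoid replace() for this case and do a single dictionary lookup per element with Series.map. This is only a sketch of the non-transitive behavior described above, not a change to pandas itself; the names mapping, s, and result are made up for the example.

```python
import pandas as pd

# Replacement rules from the report: each value should move exactly one step.
mapping = {'1': '2', '2': '3', '3': '4'}
s = pd.Series(['1', '2', '3'])

# One dictionary lookup per element, so a replaced value is never looked up again.
# Values without a rule fall back to themselves via dict.get's default.
result = s.map(lambda v: mapping.get(v, v))

print(result.tolist())  # ['2', '3', '4']
```

Because every element is looked up exactly once, the output of one rule can never be fed into another rule, which is the non-transitive result shown under Expected Output.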

Expected Output

Out[2]:
0    2
1    3
2    4
dtype: object

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.1
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

2 reactions
chris-b1 commented, Apr 18, 2017

xref #5541, #5338

2 reactions
chris-b1 commented, Apr 18, 2017

Yeah, I’d consider this more a bug than intended behavior. Deep in the replace code, when values of dtype=object are being replaced, a recursive path is used. I’m not entirely sure why, but it could probably be changed to do only one pass, as you’re expecting.

https://github.com/pandas-dev/pandas/blob/2522efa9e687e777d966f49af70b325922699bea/pandas/core/internals.py#L3271
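To make the difference concrete, here is a small pure-Python illustration of why applying the rules one after another chains them, while a single pass does not. This is not the pandas internals code linked above, just a sketch under that assumption; replace_sequential and replace_single_pass are hypothetical helper names.

```python
def replace_sequential(values, mapping):
    """Apply each rule as its own pass over the data (rules end up chained)."""
    out = list(values)
    for old, new in mapping.items():
        # A value rewritten by an earlier rule can match a later rule.
        out = [new if v == old else v for v in out]
    return out


def replace_single_pass(values, mapping):
    """Look each original value up exactly once (rules are not chained)."""
    return [mapping.get(v, v) for v in values]


mapping = {'1': '2', '2': '3', '3': '4'}
values = ['1', '2', '3']

print(replace_sequential(values, mapping))   # ['4', '4', '4']
print(replace_single_pass(values, mapping))  # ['2', '3', '4']
```

The sequential version reproduces the output reported in the issue for string values, and the single-pass version gives the expected output.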
