df.loc[:, 'col'] returning a view, but df.loc[df.index, 'col'] returning a copy

Code Sample, a copy-pastable example if possible

import pandas as pd

x = pd.DataFrame(zip(range(4), range(4)), columns=['a', 'b'])
print(x)
   a  b
0  0  0
1  1  1
2  2  2
3  3  3

q = x.loc[:, 'a']
q += 2
print(x)
   a  b
0  2  0
1  3  1
2  4  2
3  5  3

x = pd.DataFrame(zip(range(4), range(4)), columns=['a', 'b'])
print(x)
   a  b
0  0  0
1  1  1
2  2  2
3  3  3

q = x.loc[x.index, 'a']
q += 2
print(x)
   a  b
0  0  0
1  1  1
2  2  2
3  3  3

Problem description

df.loc[:, 'col'] returns a view, but df.loc[df.index, 'col'] returns a copy. Is this intended? How can I make sure I get a copy?

Expected Output

I expected .loc[] to always return a copy.

pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None

Issue Analytics

  • State: open
  • Created 7 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
jreback commented, Mar 9, 2017

This is as expected (see https://github.com/pandas-dev/pandas/issues/6149), but see below.

df.loc[:, columns] is treated as df[columns], which may return a view.

df.loc[indexer, columns] may also return a view, but in practice it almost never does.
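
One way to see which case applies on a given pandas version is to check whether the result shares memory with the frame. The following is only a sketch: the helper name shares_memory is made up for illustration, it assumes a pandas version that provides Series.to_numpy (0.24 or later), and the outcome for the first two calls depends on the installed version, so no output is shown for them.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({"a": range(4), "b": range(4)})


def shares_memory(series, frame, column):
    """Hypothetical helper: True if `series` overlaps frame[column] in memory."""
    return bool(np.shares_memory(series.to_numpy(), frame[column].to_numpy()))


sliced = x.loc[:, "a"]         # may be a view, depending on the pandas version
indexed = x.loc[x.index, "a"]  # usually a copy, but not guaranteed either way
copied = x.loc[:, "a"].copy()  # explicit copy, never shares memory with x

print("slice shares memory:  ", shares_memory(sliced, x, "a"))
print("indexer shares memory:", shares_memory(indexed, x, "a"))
print("copy shares memory:   ", shares_memory(copied, x, "a"))
```

Only the last line is certain to print False; the other two report whatever the installed pandas actually does.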

Yes, if indexer is df.index, we could treat this as the former situation (in other words, when the indexer is exactly equal to the index of the frame). It's just an indexer.equals(df.index) type of comparison.

So I'll mark this as a compat issue. If you'd like to do a pull request, that would be great. This is actually a very small change; see https://github.com/pandas-dev/pandas/blob/master/pandas/core/indexing.py#L503
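
The comparison described above can be sketched outside pandas internals. This is not the code in indexing.py; selects_all_rows is a hypothetical helper that only illustrates what an indexer.equals(df.index) check would distinguish, using the real Index.equals method.

```python
import pandas as pd


def selects_all_rows(df, indexer):
    """Hypothetical helper: True if `indexer` is an Index equal to df.index."""
    return isinstance(indexer, pd.Index) and indexer.equals(df.index)


x = pd.DataFrame({"a": range(4), "b": range(4)})

print(selects_all_rows(x, x.index))        # same labels, same order
print(selects_all_rows(x, x.index[::-1]))  # same labels, reversed order
print(selects_all_rows(x, [0, 1, 2, 3]))   # plain list, not an Index
```

Only the first call reports a match, because Index.equals compares both the elements and their order.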

0 reactions
jorisvandenbossche commented, Oct 30, 2020

As noted in the PRs for 1.1.4 that reverted the fix for this, we should actually investigate what the expected behaviour is, because it’s not clear that the test that was added in https://github.com/pandas-dev/pandas/pull/34996 is actually correct (cc @jbrockmendel)
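
Since the expected behaviour is still under discussion, code that needs an independent object probably should not rely on either form of .loc. A minimal sketch, reusing the frame from the original report and assuming only the documented Series.copy method:

```python
import pandas as pd

x = pd.DataFrame({"a": range(4), "b": range(4)})

# Ask for a copy explicitly instead of relying on what .loc happens to return.
q = x.loc[:, "a"].copy()
q += 2

print(x)  # x is unchanged; only q was modified
print(q)
```

The same .copy() call works after x.loc[x.index, 'a'], so the result no longer depends on which indexer was used.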
