question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Code Sample, a copy-pastable example if possible

Problem description

I find the behavior of SettingWithCopyWarning quite surprising, but I guess that’s just what it is. It would be great if you could document is_copy and how to use it, though.

Whenever any function returns a dataframe, it seems like it should make sure that is_copy is set to False (or None?) so the user doesn’t get a warning if they change it - if you’re returning a dataframe, it’s unlikely that the user expects this to be a view, and you’re not doing chained assignments.

The is_copy attribute has an empty docstring in the docs and I couldn’t find any explanation of it on the website (via google). The only think that told me that overwriting this attribute is actually the right thing to do (again, which is pretty weird to me), was https://github.com/pandas-dev/pandas/issues/6025#issuecomment-32904245

Issue Analytics

  • State:closed
  • Created 6 years ago
  • Comments:6 (3 by maintainers)

github_iconTop GitHub Comments

4reactions
amuellercommented, Dec 18, 2017

This was about train_test_split in sklearn, see https://github.com/scikit-learn/scikit-learn/issues/8723

But basically the pattern is:

# library code:

def discard_less_than_zero(df):
    return df[df.A >= 0] 

# user code
df = pd.DataFrame({'A':[1,2,3]})
df2 = discard_less_than_zero(df)
df2['B'] = 2

This is of course a contrived example, but I think the same applies whenever you have a library method that returns a sliced dataframe. If copy is the canonical solution, that’s fine.

# library code:

def discard_less_than_zero(df):
    return df[df.A >= 0].copy()

should do it. It just seems conceptually odd. If I understand the warning correctly, this means df[df.A >= 0] is copied twice, right? It warns me that df[df.A >= 0] is a copy, and to get rid of that warning I copy it. (unless .copy() doesn’t actually copy?).

If df is on the order of magnitude of the free memory, doing an additional copy can mean not being able to work on certain datasets.

And regarding the deprecation, I’m not married to any method. I just want a canonical way to solve the issue I described above, ideally without making unnecessary copies. I phrased the issue the way I did because the only information I could find was https://github.com/pandas-dev/pandas/issues/6025#issuecomment-32904245, in which @jreback suggests using is_copy, so I thought this was the canonical way of doing this.

0reactions
jrebackcommented, Dec 18, 2017

Canoncially, this is very easy to work with, simply .copy() after a filter assignment. Agreed that this is not the most intuitive things, but there are many edge cases; copy-on-write fixes this but won’t be available in pandas1.

In [1]: df = pd.DataFrame({'A':[1,2,3]})

In [2]: df2 = df[df.A>2]

In [3]: df2['B'] = 2
/Users/jreback/miniconda3/envs/pandas/bin/ipython:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  #!/Users/jreback/miniconda3/envs/pandas/bin/python

In [4]: df2
Out[4]: 
   A  B
2  3  2

use .copy() (or .assign())

In [1]: df = pd.DataFrame({'A':[1,2,3]})

In [2]: df2 = df[df.A>2].copy()

In [3]: df2['B'] = 2
Read more comments on GitHub >

github_iconTop Results From Across the Web

Document: copy event - Web APIs - MDN Web Docs - Mozilla
Document : copy event. The copy event fires when the user initiates a copy action through the browser's user interface.
Read more >
Document & Copy Printing | Printing Services
Staples document printing can handle all your project requirements with ease, from presentations to detailed blueprints. Same day in-store pickup available.
Read more >
B&W Copies | Color Copies and Quick Prints
Document copies ? Need quick prints? We got you covered with all your printing needs. ... Select from our comprehensive list of printing...
Read more >
Copy a page
If your Word document has multiple pages, the best way to copy a single page is to manually select and copy the text...
Read more >
How to make a copy of a Word document: complete guide
Learn how to make a copy of a Word document online, on Windows, Linux, and Mac. Save a backup copy of documents, edit...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found