pd.DataFrame.__deepcopy__ does not work when elements are (nested) lists

See original GitHub issue

It looks like the pandas deepcopy is no longer fully recursive. For example, if a DataFrame holds nested lists as shown below, the innermost lists in the copy are the very same objects as in the original (their ids match).

>>> x = [[1, 2, 3, 4]]
>>> y = [[2, 3, 4, 5]]
>>> import pandas as pd
>>> df = pd.DataFrame({'x':x, 'y':y})
>>> import copy
>>> df2 = copy.deepcopy(df)
>>> id(df2['x'][0])
4572099912
>>> id(df['x'][0])
4572099912

We first noticed this issue in skbio when the copy unit tests started failing. This was not a problem with the previous pandas release. Looking at the commits in the previous release, this one looks suspicious.
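
A possible workaround, offered as a minimal sketch rather than a pandas-provided fix: copy the frame, then deep-copy each element of the object-dtype columns by hand. The helper name `deepcopy_object_columns` is hypothetical (it is not part of pandas), and the sketch assumes unique column names and elements that `copy.deepcopy` can handle.

```python
import copy

import pandas as pd


def deepcopy_object_columns(df):
    """Copy df, then deep-copy each element of every object-dtype column.

    Hypothetical helper, not a pandas API. Assumes unique column names.
    """
    out = df.copy(deep=True)
    for col in out.columns:
        if out[col].dtype == object:
            # Series.map calls copy.deepcopy once per element.
            out[col] = out[col].map(copy.deepcopy)
    return out


df = pd.DataFrame({'x': [[1, 2, 3, 4]], 'y': [[2, 3, 4, 5]]})
df2 = deepcopy_object_columns(df)

# Identity check, then equality check, on the inner lists.
print(df2.at[0, 'x'] is df.at[0, 'x'])
print(df2.at[0, 'x'] == df.at[0, 'x'])
```

With this approach the inner lists should be equal in content but no longer shared, so mutating a list in `df2` leaves `df` alone. The per-element copy costs time on large frames, so it is probably only worth it where independent nested objects are really needed.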

CC @gregcaporaso @ebolyen @jairideout

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Reactions: 1
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

5 reactions
jreback commented, Sep 1, 2017

embedding mutable objects inside a DataFrame is an antipattern

1 reaction
jreback commented, Apr 27, 2021

the community is welcome to contribute a patch

