Memory leak in `xs`?
Code Sample, a copy-pastable example if possible
import itertools
import pandas as pd
import gc
import random

names = ['a', 'b', 'c', 'd']
levels = [range(10) for x in names]

def go():
    full_index = pd.MultiIndex.from_product(levels, names=names)
    full_frame = pd.DataFrame(index=full_index, columns=['fail', 'succeed']).apply(makefake, axis=1)
    # Take a cross-section for every index combination, collecting garbage each time.
    for choice in itertools.product(*levels):
        gc.collect()
        full_frame.xs(choice, level=names)

def makefake(row):
    row['fail'] = random.random()
    row['succeed'] = random.random()
    return row

go()
Problem description
The above code seems to exhibit a memory leak. I would expect memory usage to plateau over time.
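One way to check whether memory plateaus is to compare traced allocations before and after a batch of repeated calls. The following is a minimal sketch using the standard-library `tracemalloc` module; the helper `traced_growth` and the two toy callables are hypothetical names introduced here for illustration and are not part of the original report. Note that `tracemalloc` only sees allocations routed through Python's allocators, so a leak in native code may not show up.

```python
import gc
import tracemalloc


def traced_growth(fn, iterations=1000, warmup=100):
    """Return the net traced bytes gained over `iterations` calls of fn."""
    # Warm up first so one-time caches are not counted as growth.
    for _ in range(warmup):
        fn()
    gc.collect()
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            fn()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


# Toy callables for comparison: one retains memory, one does not.
_retained = []


def leaky():
    _retained.append(bytearray(1024))  # kept alive by the module-level list


def clean():
    bytearray(1024)  # freed immediately


leak_bytes = traced_growth(leaky, iterations=500, warmup=10)
clean_bytes = traced_growth(clean, iterations=500, warmup=10)
print("leaky:", leak_bytes, "bytes; clean:", clean_bytes, "bytes")
```

For the report above, the callable would be something like `lambda: full_frame.xs(choice, level=names)` for a fixed `choice`. Growth that scales with `iterations` suggests retained memory, while a roughly constant figure suggests a plateau.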
Expected Output
Output of pd.show_versions()
/usr/local/lib/python3.6/dist-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
INSTALLED VERSIONS
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-38-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.4
pytest: None
pip: 18.0
setuptools: 40.0.0
Cython: 0.25.2
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.0.1
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Issue Analytics
- Created 5 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
Updating numpy to 1.15.4 fixed the problem.
I’m just using memory_profiler, invoking `mprof run myscript.py`. Afterwards, I view the plot with `mprof plot`.
I’ll check this against master first thing in the morning!
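For anyone without memory_profiler installed, a rough process-level check in the same spirit can be sketched with the standard-library `resource` module (Unix only). The helper names `peak_rss` and `sample_peak_rss` are hypothetical, introduced here for illustration. Unlike the `mprof` plot, this samples the peak resident set size, which never decreases, so a plateau shows up as the samples levelling off.

```python
import resource


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def sample_peak_rss(fn, iterations=1000, every=100):
    """Call fn repeatedly and record peak RSS after every `every` calls."""
    samples = []
    for i in range(iterations):
        fn()
        if (i + 1) % every == 0:
            samples.append(peak_rss())
    return samples


# Stand-in workload; for the report above this would be the xs() lookup.
samples = sample_peak_rss(lambda: sum(range(1000)), iterations=500, every=100)
print(samples)
```

If the recorded values keep climbing as `iterations` grows, the process is retaining memory somewhere; if they level off, usage has plateaued.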