Memory leak when array contains circular references

Memory is leaked when an array contains a circular reference:

>>> import gc
>>> import sys
>>> import numpy as np
>>> class Circular(object): pass
...
>>> c = Circular()
>>> c.arr = np.array([c])
>>> del c
>>> gc.collect()
0
>>> gc.collect()
0
>>> print([ repr(o) for o in gc.get_objects() if type(o) == Circular ])
['<__main__.Circular object at 0x100e97f10>']
>>> print([ sys.getrefcount(o) for o in gc.get_objects() if type(o) == Circular ])
[2]

This happens because PyArray doesn’t implement tp_traverse (for fairly reasonable reasons), so the cyclic garbage collector cannot see the references an array holds. The result is hard-to-track-down memory leaks.

At first pass, it seems reasonable to implement a tp_traverse that only traverses when dtype=object, but that does come with a performance tradeoff.
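
For contrast, the same kind of cycle routed through a built-in list is reclaimed, because list does implement tp_traverse. The sketch below is not part of the original report: the helper name `cycle_is_collected` is made up for illustration, and the NumPy check is optional and only runs if NumPy is installed. On affected NumPy versions the array case would be expected to report that the object survives, matching the session above.

```python
import gc
import weakref


class Circular:
    pass


def cycle_is_collected(make_container):
    """Return True if a cycle obj -> container -> obj is freed by gc.collect().

    `make_container` is a hypothetical callback that wraps obj in a container.
    """
    obj = Circular()
    probe = weakref.ref(obj)        # a weak reference does not keep obj alive
    obj.arr = make_container(obj)   # closes the cycle
    del obj
    gc.collect()
    return probe() is None


# A list is tracked and traversed by the collector, so this cycle is freed.
list_cycle_collected = cycle_is_collected(lambda obj: [obj])
print("list cycle collected:", list_cycle_collected)

# Optional: run the same check against an object array if NumPy is available.
try:
    import numpy as np
except ImportError:
    np = None

if np is not None:
    ndarray_cycle_collected = cycle_is_collected(lambda obj: np.array([obj]))
    print("ndarray cycle collected:", ndarray_cycle_collected)
```

Using a weak reference as the probe avoids scanning gc.get_objects() and does not itself add a strong reference to the object under test.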

Issue Analytics

  • State: open
  • Created: 8 years ago
  • Reactions: 1
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
jpivarski commented, Jan 14, 2022

I came here because I tracked down a memory leak in NumPy, then found out that it’s a known issue. At least I can leave my reproducer and start watching this issue.

Using pympler:

Python 3.10.1 | packaged by conda-forge | (main, Dec 22 2021, 01:39:36) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gc
>>> import numpy as np
>>> import pympler.tracker
>>> 
>>> np.__version__
'1.22.0'
>>> 
>>> class Something:
...     __slots__ = ["ref"]   # slots only makes it grow more slowly than a __dict__
...     def __init__(self, ref):
...         self.ref = ref
... 
>>> tracker = pympler.tracker.SummaryTracker()
>>> tracker.print_diff()
                     types |   # objects |   total size
========================== | =========== | ============
... (calibrate to get the state before the test) ...
>>> 
>>> array = np.empty(100, dtype="O")
>>> for i in range(len(array)):
...     array[i] = Something(array)
... 
>>> del array
>>> 
>>> gc.collect()
0
>>> 
>>> tracker.print_diff()   # the array and 100 Somethings are still there
               types |   # objects |   total size
==================== | =========== | ============
  __main__.Something |         100 |      3.91 KB
       numpy.ndarray |           1 |    912     B
                list |           2 |    160     B
                 str |           2 |    143     B
                code |           0 |     70     B

Not using pympler, just watching system memory (with free) interactively:

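import gc
import numpy as np

# Relies on the Something class defined in the session above.
def fill():
    # Build 100,000 array <-> Something reference cycles, then drop the array.
    array = np.empty(100000, dtype="O")
    for i in range(len(array)):
        array[i] = Something(array)
    del array

for i in range(100000):
    fill()
    noprint = gc.collect()   # assign the result so the REPL does not echo it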

Within about a minute, you end up using a GB of RAM, and it keeps going until you stop it, then resets to baseline when you shut down Python.
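
One possible workaround, not suggested anywhere in the thread and offered only as a sketch, is to break the cycle by hand before dropping the container, so that plain reference counting can free the elements. The helper name `clear_elements` is hypothetical. To keep the example free of third-party dependencies, it uses a plain list with the cyclic collector disabled as a stand-in for a container the collector will not help with; the same element-by-element assignment is valid on a 1-D object array. The immediate release it shows relies on CPython's reference counting.

```python
import gc
import weakref


class Something:
    # "__weakref__" is added only so the demo can probe an instance's lifetime.
    __slots__ = ["ref", "__weakref__"]

    def __init__(self, ref):
        self.ref = ref


def clear_elements(container):
    """Overwrite every element with None so the container stops referencing them."""
    for i in range(len(container)):
        container[i] = None


gc.disable()  # stand-in assumption: no cycle collection will rescue us
try:
    container = [None] * 100
    for i in range(len(container)):
        container[i] = Something(container)   # each element points back at the container

    probe = weakref.ref(container[0])
    clear_elements(container)                 # break every cycle by hand
    freed_by_refcount = probe() is None       # element freed without the collector
    del container
finally:
    gc.enable()

print("freed without gc:", freed_by_refcount)
```

This only helps when the code that owns the container knows it is about to be discarded; it does not address cycles that are created and dropped inside third-party code.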

0 reactions
seberg commented, Sep 24, 2021

The PR has stalled for a while, although I think it is a bug to not implement circular refcounting. I think it is still relevant; getting someone outside of the core team to confirm (or remove) the untracking behaviour might help it along quite a bit. There was once a question about the speed impact. I honestly don’t think it should matter, but if anyone is concerned, trying to get a check on that could help as well.
