Memory issue?
I’ve run into an issue with large matrices and memory. There seem to be two problems:
- Memory isn’t being released on successive calls of `grad`, e.g.

```python
import autograd.numpy as np
from autograd import grad

na = np.newaxis

a = 10000
b = 10000
A = np.random.randn(a)
B = np.random.randn(b)

def fn(x):
    M = A[:, na] + x[na, :]  # broadcasts to an (a, b) matrix
    return M[0, 0]

g = grad(fn)
for i in range(100):
    g(B)
```

ramps up memory on each iteration.
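For scale (an aside not in the original report): the broadcast sum `A[:, na] + x[na, :]` materializes a full `(a, b)` matrix, so with `a = b = 10000` each call allocates on the order of 800 MB; even one leaked reference per call explains the rapid growth. A small NumPy sketch of the broadcast:

```python
import numpy as np

na = np.newaxis

# Small-scale illustration of the broadcast used in fn above:
A = np.random.randn(4)
x = np.random.randn(3)
M = A[:, na] + x[na, :]          # (4, 1) + (1, 3) -> (4, 3)

assert M.shape == (4, 3)
assert M[1, 2] == A[1] + x[2]    # each entry is a pairwise sum
# With a = b = 10000, M holds 10**8 float64s, i.e. ~800 MB per call.
```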
- Memory isn’t being released during the backward pass, e.g.

```python
import autograd.numpy as np
from autograd import grad

k = 10

def fn(x):
    res = 0
    for i in range(k):
        res = res + np.sum(x)
    return res

g = grad(fn)
b = 200000
g(np.random.randn(b))
```

This seems to scale in memory (for each call) as O(k), which I don’t think is the desired behaviour. For b = 150000, however, this effect does not happen.
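One rough way to check the O(k) claim (a sketch with illustrative names, not autograd internals): if the backward pass keeps one intermediate alive per loop iteration, peak memory grows linearly in k. The effect can be mimicked and measured with `tracemalloc`:

```python
import tracemalloc

def work(k, n):
    # Simulate k retained intermediates, as if each iteration's result
    # stayed referenced (e.g. by nodes left on a tape) until the end.
    chunks = [bytearray(n) for _ in range(k)]
    return sum(len(c) for c in chunks)

def peak_bytes(k, n=200_000):
    # Measure peak traced allocation while running work(k, n).
    tracemalloc.start()
    work(k, n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

# Peak allocation grows roughly linearly with k when every
# intermediate is kept alive until the pass finishes.
assert peak_bytes(10) > 5 * peak_bytes(1)
```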
Issue Analytics
- Created: 7 years ago
- Comments: 16 (12 by maintainers)
We just did some quick experiments, and calling

```python
import gc; gc.collect()
```

explicitly does seem to prevent memory from building up.

Also, we think we identified the problem: originally, these lines in `backward_pass` ensured that there were no more circular dependencies when the backward pass finished, by popping things off the tape, so that reference counting would be sufficient to clean up garbage. However, to support efficient Jacobians I added a line which had the unintended result of preserving circular dependencies and hence preventing reference-counting cleanup.
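The mechanism described above can be shown in isolation (a minimal sketch under CPython's reference-counting semantics, not autograd's actual tape code): objects in a reference cycle are not freed by reference counting alone, but `gc.collect()` reclaims them.

```python
import gc
import weakref

class Node:
    # Stand-in for a tape node that holds a reference to its neighbour.
    pass

a, b = Node(), Node()
a.parent, b.parent = b, a        # a circular dependency
probe = weakref.ref(a)

gc.disable()                     # keep the automatic collector out of the way
del a, b                         # refcounts never reach zero: the cycle persists
assert probe() is not None       # still alive without a collector pass

gc.collect()                     # the cycle detector breaks and frees the pair
assert probe() is None
gc.enable()
```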
Still, after `tape` goes out of scope (when `grad` returns), `gc.collect()` should be able to garbage collect these things, so adding explicit calls to `gc.collect()` to user-level code after calling `grad` should work. (We could also add it to the definition of `grad`.) We’re going to implement a fix (and probably eliminate circular references with tapes in the long term), but for now, if you’re having memory problems, try calling `gc.collect()` in your code after computing a gradient.

I’ve been using the WeakKeyDict without issue for a few months now, ever since it was first proposed earlier in this thread. Also, when I ran `py.test` in `autograd/test` with and without the WeakKeyDict a few days ago, the output of the unit tests was the same. But I am happy with your solution: the WeakKeyDict is a bit too mysterious/magical for my tastes, and explicitly doing the dereferencing seems less likely to cause errors in the future.
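For reference, a hedged sketch of the alternative discussed above, using the standard library's `weakref.WeakKeyDictionary` (not autograd's actual code): entries disappear automatically once their key is no longer referenced elsewhere, which is why it also avoids the leak.

```python
import gc
import weakref

class Key:
    pass  # stand-in for an object whose lifetime controls the cached entry

table = weakref.WeakKeyDictionary()
k = Key()
table[k] = "associated value"    # e.g. per-node data during a backward pass
assert len(table) == 1

del k                            # once the key is unreachable...
gc.collect()                     # (collect() for safety on non-refcounting VMs)
assert len(table) == 0           # ...its entry vanishes from the table
```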