`parallel=True` in nopython mode seems to cause a memory leak
Reporting a bug
I am doing a `multiplication` of some sort. It works fine without `parallel=True`; with `parallel=True` it still gives the right results, but it eats up my RAM quite quickly.
- macOS 12.4 (MacBook Pro 16 with M1 Pro)
- Python 3.9.13
- numba 0.56.0, numpy 1.19.5, llvmlite 0.39.0
The first function, `multiplication`, works perfectly fine, but since I use it over many iterations in my full code, I need it to be much faster. So I tried to make it run in parallel, but this seems to create a memory leak that makes my (previously working) program crash after some number of iterations.
Here is my simplified code:
```python
import numpy as np
from numba import njit, get_num_threads, prange, float64
from numba.typed import List, Dict


@njit
def multiplication(left, right, operator, length):
    # serial kernel: accumulate count * left * right into the output slots
    result_functional = np.zeros(shape=length, dtype=float64)
    for i, left_value in enumerate(left):
        for j, right_value in enumerate(right):
            for k, count in operator[i][j].items():
                result_functional[k] += count * left_value * right_value
    return result_functional


@njit(parallel=True)
def pmultiplication(left, right, operator, length):
    # split `left` (and the matching rows of `operator`) into one chunk per
    # thread, then reduce the partial results with +=
    result_functional = np.zeros(shape=length, dtype=float64)
    num_loads = get_num_threads()
    load_size = len(left) / num_loads
    for n in prange(num_loads):
        result_functional += multiplication(
            left[round(n * load_size): round((n + 1) * load_size)], right,
            operator[round(n * load_size): round((n + 1) * load_size)], length)
    return result_functional


@njit
def random_operator(dim, trunc):
    # build a random sparse "operator": a nested typed List of Dicts
    # mapping output index -> integer count
    side_length = np.sum(dim ** np.arange(trunc + 1))
    full_length = np.sum(dim ** np.arange(2 * trunc + 1))
    numba_shuffle_operator = List()
    for i in range(side_length):
        numba_list = List()
        for j in range(side_length):
            num = np.random.randint(low=1, high=(i + j + 2))
            keys = np.random.randint(low=0, high=full_length, size=num)
            values = np.random.randint(low=1, high=num + 1, size=num)
            numba_dict = Dict()
            for k in range(num):
                numba_dict[keys[k]] = values[k]
            numba_list.append(numba_dict)
        numba_shuffle_operator.append(numba_list)
    return numba_shuffle_operator


def prepare_multiplication(dim, trunc):
    operator = random_operator(dim, trunc)
    left = np.random.random(size=len(operator))
    right = np.random.random(size=len(operator[0]))
    length = np.sum(dim ** np.arange(2 * trunc + 1))
    return left, right, operator, length


# compilation
left, right, operator, length = prepare_multiplication(dim=2, trunc=2)
multiplication(left, right, operator, length)
pmultiplication(left, right, operator, length)

# parameters that will be used
left, right, operator, length = prepare_multiplication(dim=3, trunc=4)
```
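For what it's worth, my understanding (a sketch of the semantics, not Numba's actual implementation) is that the `result_functional += ...` inside `prange` is compiled as a parallel array reduction: each thread accumulates into a private copy of the array, and the copies are combined when the loop finishes, roughly like this:

```python
import numpy as np

# Illustrative model of a prange array reduction (NOT Numba's real code):
# every thread gets a private accumulator; the copies are summed at the end.
def reduction_model(chunk_results):
    num_threads = len(chunk_results)
    private = [np.zeros_like(chunk_results[0]) for _ in range(num_threads)]
    for n in range(num_threads):        # conceptually runs in parallel
        private[n] += chunk_results[n]  # thread-local accumulation
    result = np.zeros_like(chunk_results[0])
    for n in range(num_threads):        # serial combine step
        result += private[n]
    return result
```

If those per-thread copies, or the temporary arrays returned by `multiplication`, are not released by the parallel backend, the RSS would grow with every call, exactly as the measurement below shows.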
And here is my code to show how the memory is eaten up:
```python
import psutil
import pandas as pd
from IPython.display import display

num_iter = 100
process = psutil.Process()

# RSS after each serial call
cached_rss = np.array([process.memory_info().rss])
for i in range(num_iter):
    multiplication(left, right, operator, length)
    cached_rss = np.append(cached_rss, process.memory_info().rss)

# RSS after each parallel call
cached_rss_parallel = np.array([process.memory_info().rss])
for i in range(num_iter):
    pmultiplication(left, right, operator, length)
    cached_rss_parallel = np.append(cached_rss_parallel, process.memory_info().rss)

memory = pd.DataFrame()
memory["diff since start"] = cached_rss - cached_rss[0]
memory["diff since last"] = 0
memory["diff since last"][1:] = np.diff(cached_rss, n=1)
memory["diff since start (parallel)"] = cached_rss_parallel - cached_rss_parallel[0]
memory["diff since last (parallel)"] = 0
memory["diff since last (parallel)"][1:] = np.diff(cached_rss_parallel, n=1)

pd.options.display.float_format = '{:,.0f}'.format
display(memory.tail(5).astype(float))
print(f'mean: {np.mean(memory["diff since last (parallel)"]):,.2f}')
print(f'standard dev: {np.std(memory["diff since last (parallel)"]):,.2f}')
```
|     | diff since start | diff since last | diff since start (parallel) | diff since last (parallel) |
|-----|-----------------:|----------------:|----------------------------:|---------------------------:|
| 96  | 0 | 0 | 3,342,336 | 98,304 |
| 97  | 0 | 0 | 3,342,336 | 0 |
| 98  | 0 | 0 | 3,440,640 | 98,304 |
| 99  | 0 | 0 | 3,440,640 | 0 |
| 100 | 0 | 0 | 3,522,560 | 81,920 |

mean: 34,876.83
standard dev: 86,869.42
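One way I could try to narrow this down (a debugging sketch, not part of my original measurements) is to compare the allocation counters of Numba's own runtime (NRT) around a call; if the leaked memory is NRT-managed, the gap between allocations and frees should grow call after call:

```python
from numba.core.runtime import rtsys

# NRT keeps running counters of allocations and frees made by compiled code
before = rtsys.get_allocation_stats()
pmultiplication(left, right, operator, length)
after = rtsys.get_allocation_stats()

# a persistently positive gap means compiled code allocates NRT-managed
# memory that is never released
print("alloc - free gap:",
      (after.alloc - before.alloc) - (after.free - before.free))
print("meminfo gap:",
      (after.mi_alloc - before.mi_alloc) - (after.mi_free - before.mi_free))
```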
Follow-up comments from the thread:
- @louisamand, a patch containing the fix is still under review.
- This leak seems to still be present in 0.56.3.
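Until the fix lands, one possible interim workaround (an untested sketch, assuming the leak is tied to the implicit `+=` array reduction inside `prange`; `pmultiplication_buffered` is a hypothetical name) is to give each chunk its own row of a preallocated buffer and combine the rows serially, so no array reduction happens in the parallel region:

```python
import numpy as np
from numba import njit, get_num_threads, prange, float64
# reuses multiplication() exactly as defined in the report above

@njit(parallel=True)
def pmultiplication_buffered(left, right, operator, length):
    num_loads = get_num_threads()
    load_size = len(left) / num_loads
    # one private row per chunk instead of a shared reduction target
    partial = np.zeros(shape=(num_loads, length), dtype=float64)
    for n in prange(num_loads):
        lo = round(n * load_size)
        hi = round((n + 1) * load_size)
        partial[n, :] = multiplication(left[lo:hi], right, operator[lo:hi], length)
    # combine the per-chunk rows serially, outside the parallel region
    result_functional = np.zeros(shape=length, dtype=float64)
    for n in range(num_loads):
        result_functional += partial[n]
    return result_functional
```

It has the same call signature as `pmultiplication`, so it can be swapped in directly; whether it actually avoids the leak on 0.56.x would need to be verified with the RSS measurement above.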