
parallel=True in nopython mode seems to cause a memory leak

See original GitHub issue

Reporting a bug

So I am doing a ‘multiplication’ of some sort. It works fine without parallel=True, and even though it gives me the right results when run in parallel, it eats up my RAM quite quickly.

  • macOS 12.4 (MacBook Pro 16 with M1 Pro)
  • Python 3.9.13
  • numba 0.56.0, numpy 1.19.5, llvmlite 0.39.0

The first function, multiplication, works perfectly fine, but since I call it over many iterations in my full code, I need it to be much faster. So I tried to run it in parallel, but that seems to create a memory leak that makes my (previously working) program crash after some number of iterations.

Here is my simplified code:

import numpy as np
from numba import njit, get_num_threads, prange, float64
from numba.typed import List, Dict

@njit
def multiplication(left, right, operator, length):
    result_functional = np.zeros(shape=length, dtype=float64)
    for i, left_value in enumerate(left):
        for j, right_value in enumerate(right):
            for k, count in operator[i][j].items():
                result_functional[k] += count * left_value * right_value
    return result_functional

@njit(parallel=True)
def pmultiplication(left, right, operator, length):
    result_functional = np.zeros(shape=length, dtype=float64)
    num_loads = get_num_threads()
    load_size = len(left) / num_loads
    for n in prange(num_loads):
        result_functional += multiplication(left[round(n * load_size): round((n+1) * load_size)], right,
                                            operator[round(n * load_size): round((n+1) * load_size)], length)
    return result_functional

@njit
def random_operator(dim, trunc):
    side_length = np.sum(dim ** np.arange(trunc + 1))
    full_length = np.sum(dim ** np.arange(2*trunc + 1))
    numba_shuffle_operator = List()
    for i in range(side_length):
        numba_list = List()
        for j in range(side_length):
            num = np.random.randint(low=1, high=(i+j+2))
            keys = np.random.randint(low=0, high=full_length, size=num)
            values = np.random.randint(low=1, high=num+1, size=num)
            numba_dict = Dict()
            for k in range(num):
                numba_dict[keys[k]] = values[k]
            numba_list.append(numba_dict)
        numba_shuffle_operator.append(numba_list)
    return numba_shuffle_operator


def prepare_multiplication(dim, trunc):
    operator = random_operator(dim, trunc)
    left = np.random.random(size=len(operator))
    right = np.random.random(size=len(operator[0]))
    length = np.sum(dim ** np.arange(2 * trunc + 1))
    return left, right, operator, length

# compilation
left, right, operator, length = prepare_multiplication(dim=2, trunc=2)
multiplication(left, right, operator, length)
pmultiplication(left, right, operator, length)

# parameters that will be used
left, right, operator, length = prepare_multiplication(dim=3, trunc=4)
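
A restructuring that is sometimes suggested for prange reductions is to give each chunk its own output buffer and sum the buffers at the end, instead of accumulating with `result_functional += ...` on the temporary array returned by the inner call. Below is a minimal NumPy-only sketch of that pattern; the plain matrix product and the names `chunked_multiply`/`op` are stand-ins for the operator contraction, and under Numba the chunk loop would become `prange` inside `@njit(parallel=True)`. Whether this sidesteps the leak reported here is not confirmed.

```python
import numpy as np

def chunked_multiply(left, op, length, num_chunks=4):
    # One row of `partials` per chunk/thread: each iteration writes only its
    # own row, so no cross-iteration `+=` reduction on temporaries is needed.
    partials = np.zeros((num_chunks, length))
    load_size = len(left) / num_chunks
    for n in range(num_chunks):  # prange(num_chunks) in a jitted version
        lo, hi = round(n * load_size), round((n + 1) * load_size)
        partials[n] = left[lo:hi] @ op[lo:hi]
    return partials.sum(axis=0)

rng = np.random.default_rng(0)
left = rng.random(10)
op = rng.random((10, 6))  # dense stand-in for the sparse `operator`
assert np.allclose(chunked_multiply(left, op, 6), left @ op)
```

The final `partials.sum(axis=0)` happens once, outside the parallel region, which keeps the per-chunk work fully independent.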

And here is my code to show how the memory is eaten up:

import psutil
import pandas as pd
from IPython.display import display

num_iter = 100
process = psutil.Process()

cached_rss = np.array([process.memory_info().rss])
for i in range(num_iter):
    multiplication(left, right, operator, length)
    cached_rss = np.append(cached_rss, process.memory_info().rss)

cached_rss_parallel = np.array([process.memory_info().rss])
for i in range(num_iter):
    pmultiplication(left, right, operator, length)
    cached_rss_parallel = np.append(cached_rss_parallel, process.memory_info().rss)

memory = pd.DataFrame()
memory["diff since start"] = cached_rss - cached_rss[0]
# prepend the first sample so the first diff is 0 and the lengths match;
# this also avoids pandas chained-assignment warnings from slice writes
memory["diff since last"] = np.diff(cached_rss, prepend=cached_rss[0])
memory["diff since start (parallel)"] = cached_rss_parallel - cached_rss_parallel[0]
memory["diff since last (parallel)"] = np.diff(cached_rss_parallel, prepend=cached_rss_parallel[0])

pd.options.display.float_format = '{:,.0f}'.format
display(memory.tail(5).astype(float))
print(f'mean: {np.mean(memory["diff since last (parallel)"]):,.2f}')
print(f'standard dev: {np.std(memory["diff since last (parallel)"]):,.2f}')

     diff since start  diff since last  diff since start (parallel)  diff since last (parallel)
96                  0                0                    3,342,336                      98,304
97                  0                0                    3,342,336                           0
98                  0                0                    3,440,640                      98,304
99                  0                0                    3,440,640                           0
100                 0                0                    3,522,560                      81,920

mean: 34,876.83
standard dev: 86,869.42
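
Since the per-call growth is bursty (the table shows a jump only every few calls), a least-squares slope over the whole RSS trace gives a steadier leak-rate estimate than the mean of the per-call diffs. A sketch with stand-in data; in a real run you would pass the `cached_rss_parallel` array collected above:

```python
import numpy as np

# Stand-in RSS trace: linear growth of ~35 kB per call, like the run above.
cached_rss_parallel = 100_000_000 + 34_877 * np.arange(101)

iters = np.arange(len(cached_rss_parallel))
slope, intercept = np.polyfit(iters, cached_rss_parallel, 1)
print(f"estimated leak rate: {slope:,.0f} bytes/iteration")
```

A clearly positive slope that persists as the iteration count grows is the leak signature; a flat slope with occasional steps is more likely allocator or cache behavior.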

Issue Analytics

  • State: open
  • Created a year ago
  • Comments:8 (1 by maintainers)

Top GitHub Comments

1 reaction
guilhermeleobas commented, Nov 7, 2022

@louisamand, a patch containing the fix is still under review

0 reactions
louisamand commented, Nov 7, 2022

This leak seems to still be present on 0.56.3


