
Temporal memory leak

See original GitHub issue

Description

Hello, I love CuPy and would like to thank you for all the hard work you’re doing.

It seems that a type conversion combined with some operations causes a temporary memory leak, although deleting all variables frees the memory. Please see Code #2: is this expected behavior? I hope this issue is reproducible.

To Reproduce

import cupy as cp

mempool = cp.get_default_memory_pool()
pinned_mempool = cp.get_default_pinned_memory_pool()
shape1 = (128,1024,1024)
shape2 = (2,2)

#---------------------------------------------
# Code #1 (This code is OK)
arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32')

arr2 = cp.ones(shape2)

del arr1_float,arr1
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 512
print(mempool.total_bytes())             # 512
print(pinned_mempool.n_free_blocks())    # 0

del arr2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0

#---------------------------------------------


#---------------------------------------------
# Code #2 (This code causes a temporal memory leak?)

arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32') - 1   # the only change from Code #1: subtract 1

arr2 = cp.ones(shape2)

del arr1_float,arr1
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 512
print(mempool.total_bytes())             # 536870912  the pool still holds 512 MiB despite arr1_float and arr1 being deleted
print(pinned_mempool.n_free_blocks())    # 0

del arr2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0   Once all variables are deleted, the memory is freed.
#---------------------------------------------

#---------------------------------------------
# Code #3 (This code is OK)


arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32') - 1

#  without arr2

del arr1_float,arr1
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0
#---------------------------------------------

#---------------------------------------------
# Code #4 (This code is OK)

arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32')
arr1_float2 = arr1_float - 1             # the subtraction is separated out and stored in a different variable

arr2 = cp.ones(shape2)

del arr1_float,arr1,arr1_float2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 512
print(mempool.total_bytes())             # 512  No huge memory consumption
print(pinned_mempool.n_free_blocks())    # 0

del arr2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0   
#---------------------------------------------

Installation

Wheel (pip install cupy-***)

Environment

OS                           : Windows-10-10.0.19043-SP0
Python Version               : 3.9.12
CuPy Version                 : 11.2.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.21.5
SciPy Version                : 1.7.3
Cython Build Version         : 0.29.32
Cython Runtime Version       : 0.29.28
CUDA Root                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
nvcc PATH                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin\nvcc.EXE
CUDA Build Version           : 11070
CUDA Driver Version          : 11060
CUDA Runtime Version         : 11030
cuBLAS Version               : (available)
cuFFT Version                : 10402
cuRAND Version               : 10204
cuSOLVER Version             : (11, 1, 1)
cuSPARSE Version             : (available)
NVRTC Version                : (11, 3)
Thrust Version               : 101500
CUB Build Version            : 101500
Jitify Build Version         : 4a37de0
cuDNN Build Version          : (not loaded; try `import cupy.cuda.cudnn` first)
cuDNN Version                : (not loaded; try `import cupy.cuda.cudnn` first)
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : 10500
cuSPARSELt Build Version     : None
Device 0 Name                : NVIDIA RTX A6000
Device 0 Compute Capability  : 86
Device 0 PCI Bus ID          : 0000:01:00.0
Device 1 Name                : NVIDIA RTX A6000
Device 1 Compute Capability  : 86
Device 1 PCI Bus ID          : 0000:4C:00.0

Additional Information

No response

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
mDiracYas commented on Nov 8, 2022

Thank you for being so kind and explaining everything to me!

Disabling a memory pool could be another option depending on the use case.

Disabling the memory pool solved my memory problem. Thank you for your advice!

You can try to use the cuda asynchronous malloc …

Thank you for sharing such an advanced solution. It looks interesting and I will try this solution to both improve speed and reduce memory consumption.
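
For reference, the “cuda asynchronous malloc” suggestion most likely refers to routing allocations through cupy.cuda.MemoryAsyncPool, CuPy’s wrapper around the CUDA stream-ordered allocator (cudaMallocAsync). A minimal sketch of that setup, assuming CUDA 11.2+ and a CuPy build that exposes MemoryAsyncPool; the array shape is reused from the reproduction code above:

import cupy as cp

# Route device allocations through the CUDA async memory pool
# (cudaMallocAsync) instead of CuPy's own caching pool.
cp.cuda.set_allocator(cp.cuda.MemoryAsyncPool().malloc)

arr1 = cp.ones((128, 1024, 1024))
arr1_float = arr1.astype('float32') - 1
del arr1_float, arr1   # freed memory is now managed by the CUDA driver's
                       # stream-ordered pool rather than cached by CuPy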

0 reactions
kmaehashi commented on Nov 8, 2022

Disabling a memory pool could be another option depending on the use case. https://docs.cupy.dev/en/stable/reference/generated/cupy.get_default_memory_pool.html#cupy.get_default_memory_pool
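
As a concrete sketch of that suggestion (mirroring the linked get_default_memory_pool documentation; the allocations are just the ones from the reproduction code above):

import cupy as cp

# Disable CuPy's caching memory pools: allocations go straight to
# cudaMalloc and freed memory is returned to the device immediately,
# at the cost of slower allocation/free calls.
cp.cuda.set_allocator(None)
cp.cuda.set_pinned_memory_allocator(None)

arr1 = cp.ones((128, 1024, 1024))
arr1_float = arr1.astype('float32') - 1
del arr1_float, arr1   # with no pool, nothing stays cached after this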
