
Temporal memory leak

See original GitHub issue

Description

Hello, I love CuPy and would like to thank you for all the hard work you’re doing.

It seems that a type conversion combined with some operations causes a temporary memory leak, although deleting all variables frees the memory. Please see Code #2: is this expected behavior? I hope this issue is reproducible.

To Reproduce

import cupy as cp

mempool = cp.get_default_memory_pool()
pinned_mempool = cp.get_default_pinned_memory_pool()
shape1 = (128,1024,1024)
shape2 = (2,2)

#---------------------------------------------
# Code #1 (This code is OK)
arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32')

arr2 = cp.ones(shape2)

del arr1_float,arr1
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 512
print(mempool.total_bytes())             # 512
print(pinned_mempool.n_free_blocks())    # 0

del arr2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0

#---------------------------------------------


#---------------------------------------------
# Code #2 (This code causes a temporal memory leak?)

arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32') - 1   # the only change from Code #1: subtract 1

arr2 = cp.ones(shape2)

del arr1_float,arr1
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 512
print(mempool.total_bytes())             # 536870912  the pool still holds 512 MiB despite arr1_float and arr1 being deleted
print(pinned_mempool.n_free_blocks())    # 0

del arr2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0   Once all variables are deleted, the memory is freed.
#---------------------------------------------

#---------------------------------------------
# Code #3 (This code is OK)


arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32') - 1

#  without arr2

del arr1_float,arr1
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0
#---------------------------------------------

#---------------------------------------------
# Code #4 (This code is OK)

arr1 = cp.ones(shape1)
arr1_float = arr1.astype('float32')
arr1_float2 = arr1_float - 1             # the subtraction is separated out and stored in a different variable

arr2 = cp.ones(shape2)

del arr1_float,arr1,arr1_float2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 512
print(mempool.total_bytes())             # 512  No huge memory consumption
print(pinned_mempool.n_free_blocks())    # 0

del arr2
mempool.free_all_blocks()
pinned_mempool.free_all_blocks()
print(mempool.used_bytes())              # 0
print(mempool.total_bytes())             # 0
print(pinned_mempool.n_free_blocks())    # 0   
#---------------------------------------------

Installation

Wheel (pip install cupy-***)

Environment

OS                           : Windows-10-10.0.19043-SP0
Python Version               : 3.9.12
CuPy Version                 : 11.2.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.21.5
SciPy Version                : 1.7.3
Cython Build Version         : 0.29.32
Cython Runtime Version       : 0.29.28
CUDA Root                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
nvcc PATH                    : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin\nvcc.EXE
CUDA Build Version           : 11070
CUDA Driver Version          : 11060
CUDA Runtime Version         : 11030
cuBLAS Version               : (available)
cuFFT Version                : 10402
cuRAND Version               : 10204
cuSOLVER Version             : (11, 1, 1)
cuSPARSE Version             : (available)
NVRTC Version                : (11, 3)
Thrust Version               : 101500
CUB Build Version            : 101500
Jitify Build Version         : 4a37de0
cuDNN Build Version          : (not loaded; try `import cupy.cuda.cudnn` first)
cuDNN Version                : (not loaded; try `import cupy.cuda.cudnn` first)
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : 10500
cuSPARSELt Build Version     : None
Device 0 Name                : NVIDIA RTX A6000
Device 0 Compute Capability  : 86
Device 0 PCI Bus ID          : 0000:01:00.0
Device 1 Name                : NVIDIA RTX A6000
Device 1 Compute Capability  : 86
Device 1 PCI Bus ID          : 0000:4C:00.0

Additional Information

No response

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
mDiracYas commented on Nov 8, 2022

Thank you for being so kind and explaining everything to me!

Disabling a memory pool could be another option depending on the use case.

Disabling the memory pool solved my memory problem. Thank you for your advice!

You can try to use the cuda asynchronous malloc …

Thank you for sharing such an advanced solution. It looks interesting and I will try this solution to both improve speed and reduce memory consumption.
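
For reference, the “cuda asynchronous malloc” suggestion most likely refers to routing allocations through cupy.cuda.MemoryAsyncPool, CuPy’s wrapper around the CUDA stream-ordered allocator (cudaMallocAsync). A minimal sketch of that setup, assuming CUDA 11.2+ and a CuPy build that exposes MemoryAsyncPool; the array shape is reused from the reproduction code above:

import cupy as cp

# Route device allocations through the CUDA async memory pool
# (cudaMallocAsync) instead of CuPy's own caching pool.
cp.cuda.set_allocator(cp.cuda.MemoryAsyncPool().malloc)

arr1 = cp.ones((128, 1024, 1024))
arr1_float = arr1.astype('float32') - 1
del arr1_float, arr1   # freed memory is now managed by the CUDA driver's
                       # stream-ordered pool rather than cached by CuPy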

0 reactions
kmaehashi commented on Nov 8, 2022

Disabling a memory pool could be another option depending on the use case. https://docs.cupy.dev/en/stable/reference/generated/cupy.get_default_memory_pool.html#cupy.get_default_memory_pool
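
As a concrete sketch of that suggestion (mirroring the linked get_default_memory_pool documentation; the allocations are just the ones from the reproduction code above):

import cupy as cp

# Disable CuPy's caching memory pools: allocations go straight to
# cudaMalloc and freed memory is returned to the device immediately,
# at the cost of slower allocation/free calls.
cp.cuda.set_allocator(None)
cp.cuda.set_pinned_memory_allocator(None)

arr1 = cp.ones((128, 1024, 1024))
arr1_float = arr1.astype('float32') - 1
del arr1_float, arr1   # with no pool, nothing stays cached after this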
