Python3: Nuitka leaks references when class variables are accessed on class level

nuitka version 0.5.33rc7 (factory branch)
Microsoft Windows Server 2012 Standard
Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 27 2018, 03:37:03) [MSC v.1900 64 bit (AMD64)] on win32

I am using the grab.spider framework for my work. My script is a simple web crawler that downloads web pages and searches them for specified markers. The grab framework uses threading, pycurl, lxml, and other libraries internally.

Now I have a serious problem. Under regular Python, the bot runs with 80 threads and starts at roughly 100 MB of RAM. Usage grows step by step to about 500 MB and then drops back to around 200 MB, and this cycle repeats for as long as the script runs.

Regular Python releases this memory properly, but when the script is compiled with Nuitka I see only growth: the memory is never released. I waited a long time and it made no difference; usage keeps going up and never comes down.

I tried to write a simple test to reproduce the situation, like this:

import time
import psutil
import os
import humanfriendly


def mem_usage():
    process = psutil.Process(os.getpid())
    res = process.memory_info().rss
    return res


test = {}


def doit():
    global test
    test['abc'] = "123" * 200 * 200 * 200 * 40
    print(f"used top: {humanfriendly.format_size(mem_usage())}")
    time.sleep(1)


print(f"used before: {humanfriendly.format_size(mem_usage())}")
doit()
test = None
# time.sleep(1)
print(f"used after: {humanfriendly.format_size(mem_usage())}")

But there is no problem here: Nuitka releases the memory in this simple example. My problem is located somewhere deeper, and I lack the skills to know what to do next. It would be great to provide you with a simple example, but the grab library is too complex for me to analyze.

Dear @kayhayen, could you give me any advice or recommendations on how to handle this?

My current plan: when the bot finishes its work, insert pdb.set_trace(), call gc.collect(), and see how it reacts; maybe the memory will be released.
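The gc.collect() check in that plan could look roughly like the sketch below. It uses only the standard library; Node, make_cycle and collect_and_report are hypothetical names made up for illustration and are not part of grab or Nuitka. The idea is that objects kept alive only by reference cycles disappear after a forced collection, whereas objects held by a leaked reference would not, so the result hints at which kind of growth is involved.

```python
import gc


class Node:
    """Hypothetical object that can take part in a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects that point at each other: reference counting alone
    # cannot free them, only the cyclic garbage collector can.
    a, b = Node(), Node()
    a.other, b.other = b, a


def collect_and_report():
    """Force a full collection and report tracked-object counts around it."""
    before = len(gc.get_objects())
    unreachable = gc.collect()  # number of unreachable objects found
    after = len(gc.get_objects())
    return before, unreachable, after


gc.disable()  # keep automatic collections from hiding the effect
for _ in range(1000):
    make_cycle()
before, unreachable, after = collect_and_report()
gc.enable()

print(f"tracked before: {before}, unreachable: {unreachable}, tracked after: {after}")
```

If the same check inside the real bot shows almost nothing collected while memory stays high, cycles are probably not the cause and a reference leak becomes the more likely suspect.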

Issue Analytics

  • State:closed
  • Created 5 years ago
  • Comments:6 (4 by maintainers)

Top GitHub Comments

kayhayen commented, Sep 5, 2018

I have recently discovered that for Python3 classes there is a leak and fixed it for the factory branch. Could well have to do with your issue, if class body code is run frequently for your program. It’s going to be in a pre-release next week.
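One way to probe for this kind of leak is to run a class body many times and watch the reference count of an object it touches. The sketch below is an assumption about what "class variables accessed on class level" looks like in practice, not the actual Nuitka test case; SENTINEL, build_class and refcount_growth are hypothetical names. Under regular CPython the growth should stay at or near zero, so a value that scales with the iteration count in a compiled binary would point at leaked references.

```python
import gc
import sys

# Hypothetical marker object whose reference count is watched.
SENTINEL = object()


def build_class():
    # Every call executes the class body again.
    class Config:
        value = SENTINEL  # class variable assigned in the class body
        alias = value     # the same variable read back at class level

    return Config


def refcount_growth(iterations=1000):
    """Return how much SENTINEL's reference count grew over the loop."""
    before = sys.getrefcount(SENTINEL)
    for _ in range(iterations):
        build_class()
    # Classes sit in reference cycles, so collect before measuring.
    gc.collect()
    return sys.getrefcount(SENTINEL) - before


growth = refcount_growth()
print(f"refcount growth after 1000 class bodies: {growth}")
```

Running the same file both interpreted and compiled, then comparing the two numbers, keeps the comparison independent of grab and its dependencies.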

kayhayen commented, Sep 13, 2018

This is in the current pre-release, which will become a release any day now, so I am closing this.
