RAM increase slowly

First of all, thanks for your work. When I use grequests, RAM usage slowly increases. My code looks like this:

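import grequests  # imported before requests, since grequests relies on gevent monkey-patching
import requests

def exception_handler(request, exception):
    # Called by grequests.imap for every request that raised instead of returning a response
    print("Request failed request:{} \n exception:{} ".format(request,exception))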

if __name__ == '__main__':
    task = []
    f_file= "./data_scp/3031_xiuxiu_coverImage_v1.dat"

    session = requests.session()
    with open(f_file,"r") as r_f:
        for i in r_f:
            tmp = i.strip("\n").split(",")
            url = tmp[-1]
            feed_id = tmp[0]
            rs = grequests.request("GET", url,session=session)
            task.append(rs)

    resp = grequests.imap(task, size=30,exception_handler=exception_handler)

    for i in resp:
        if i.status_code ==200:
            print(i.status_code)

The file 3031_xiuxiu_coverImage_v1.dat contains lines such as:

6650058925696645684,http://***8.jpg
6650058925696645684,http://***8.jpg
6650058925696645684,http://***8.jpg
6650058925696645684,http://***8.jpg

My grequests version is 0.4.0. Thanks in advance.
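The input format described above (one `feed_id,url` pair per line) can be read lazily instead of collecting every request in a list first. The following is only a sketch: `iter_feed` is a hypothetical helper name, and the `example.com` URLs are placeholders, not the real data.

```python
import io


def iter_feed(lines):
    """Yield (feed_id, url) pairs one line at a time (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(",")
        # First field is the feed id, last field is the URL, as in the post
        yield parts[0], parts[-1]


# Placeholder data standing in for the .dat file
sample = io.StringIO(
    "6650058925696645684,http://example.com/a.jpg\n"
    "6650058925696645685,http://example.com/b.jpg\n"
)
pairs = list(iter_feed(sample))
print(pairs)
```

Assuming `grequests.imap` accepts any iterable of requests, a generator expression built on top of `iter_feed` could be passed to it directly (while the file is still open), so that no complete task list has to stay in memory.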

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

spyoungtech commented, Apr 3, 2020

So, I’m getting closer to figuring out what is going on. Here are a few things I’ve discovered so far.

grequests opening (and not closing) a new session for each request prevents memory from being freed

Take the following code:

import grequests
from memory_profiler import profile


@profile
def test():
    url = "https://httpbin.org/status/200"
    reqs=[grequests.get(url) for _ in range(100)]
    responses = grequests.imap(reqs, size=5)
    for resp in responses:
        ...
    print('ok')  # memory should be freed by now

Notice that memory builds up (to about 104 MiB) and is never really released, even though no (apparent) references exist anymore. The buildup also gets bigger if I increase the number of requests.

Line #    Mem usage    Increment   Line Contents
================================================
     5   35.977 MiB   35.977 MiB   @profile
     6                             def test():
     7   35.977 MiB    0.000 MiB       url = "https://httpbin.org/status/200"
     8   36.477 MiB    0.062 MiB       reqs=[grequests.get(url) for _ in range(100)]
     9   36.477 MiB    0.000 MiB       responses = grequests.imap(reqs, size=10)
    10  104.605 MiB  104.605 MiB       for resp in responses:
    11  104.605 MiB    0.000 MiB           ...
    12  104.613 MiB    0.008 MiB       print('ok') # memory should be freed by now

But if I modify the function to use a requests.Session object for its session…

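import grequests
import requests
from memory_profiler import profile
sesh = requests.Session()  # one shared session for all requests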
@profile
def test():
    url = "https://httpbin.org/status/200"
    reqs=[grequests.get(url, session=sesh) for _ in range(500)]
    responses = grequests.imap(reqs, size=5)
    for resp in responses:
        ...
    print('ok')  # memory should be freed by now

With this change, there is not nearly as much memory buildup. (The amount partially depends on the pool size used; a bigger pool builds up more memory.)
Also, now that we’re using a session, increasing the number of requests no longer increases the amount of memory built up. It is the same for 100 or 500 requests.

Line #    Mem usage    Increment   Line Contents
================================================
     5   36.090 MiB   36.090 MiB   @profile
     6                             def test():
     7   36.090 MiB    0.000 MiB       url = "https://httpbin.org/status/200"
     8   36.090 MiB    0.000 MiB       reqs=[grequests.get(url, session=sesh) for _ in range(500)]
     9   36.090 MiB    0.000 MiB       responses = grequests.imap(reqs, size=5)
    10   42.051 MiB   42.051 MiB       for resp in responses:
    11   42.051 MiB    0.000 MiB           ...
    12   42.059 MiB    0.008 MiB       print('ok')  # memory should be freed by now

Memory not freed due to references in request list

Going back to the very first code example and its profile from the previous section (the one that does not use a session), another issue with freeing memory can be seen:

    10  104.605 MiB  104.605 MiB       for resp in responses:
    11  104.605 MiB    0.000 MiB           ...
    12  104.613 MiB    0.008 MiB       print('ok') # memory should be freed by now

By the time print('ok') runs, the generator has been exhausted and it SHOULD have freed up memory, but it doesn’t. This is because the request list is still holding onto references, preventing garbage collection.

Adding del reqs allows the memory to be freed once the generator is exhausted.

@profile
def test():
    url = "https://httpbin.org/status/200"
    reqs=[grequests.get(url) for _ in range(100)]
    responses = grequests.imap(reqs, size=5)
    del reqs
    for resp in responses:
        ...
    print('ok')  # memory should be freed by now

With the references from the request list removed, memory is now freed (more) properly.

Line #    Mem usage    Increment   Line Contents
================================================
     5   35.977 MiB   35.977 MiB   @profile
     6                             def test():
     7   35.977 MiB    0.000 MiB       url = "https://httpbin.org/status/200"
     8   36.477 MiB    0.062 MiB       reqs=[grequests.get(url) for _ in range(100)]
     9   36.477 MiB    0.000 MiB       responses = grequests.imap(reqs, size=5)
    10   36.477 MiB    0.000 MiB       del reqs
    11  104.176 MiB  104.176 MiB       for resp in responses:
    12  104.176 MiB    0.004 MiB           ...
    13   56.660 MiB    0.000 MiB       print('ok')  # memory should be freed by now
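The effect of `del reqs` can be reproduced without any network access. The sketch below is a simplified model, not grequests itself: `FakeRequest` and `imap_like` are hypothetical stand-ins for a request object that keeps its response and for a generator that iterates over the request list. Weak references are used to count how many request objects are still alive after the generator is exhausted.

```python
import gc
import weakref


class FakeRequest:
    """Hypothetical stand-in for a request object that stores its response."""

    def __init__(self):
        self.response = None

    def send(self):
        self.response = bytearray(1024)  # pretend response payload
        return self


def imap_like(requests_list):
    """Simplified model of a generator that yields responses one by one."""
    for req in requests_list:
        yield req.send().response


def alive_after_run(drop_list):
    reqs = [FakeRequest() for _ in range(100)]
    refs = [weakref.ref(r) for r in reqs]  # weak refs do not keep objects alive
    responses = imap_like(reqs)
    if drop_list:
        del reqs  # the generator now holds the only reference to the list
    for _ in responses:
        pass
    gc.collect()
    # Count request objects that are still reachable
    return sum(1 for ref in refs if ref() is not None)


print(alive_after_run(drop_list=False))  # list still referenced by the caller
print(alive_after_run(drop_list=True))   # list released with the generator
```

In this model, every request object (and the response it stores) stays alive for as long as the caller keeps the list, and all of them become collectable once both the list name and the exhausted generator are gone.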

A remaining problem…

    13   56.660 MiB    0.000 MiB       print('ok')  # memory should be freed by now

Notice that, while we are freeing some memory, not everything is freed. Specifically, we have about 56 MiB at the end of this function, but it should be closer to the ~36 MiB we started with. This number grows with the number of requests (with 500 requests, ~86 MiB is left).

Since you’re already using a session, I think whatever is holding on to this little bit of memory that’s building up is causing your memory leak. I’m still working on figuring out exactly what that is!
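One possible way to narrow down what is holding on to memory is the standard library's `tracemalloc` module, which can compare two snapshots and report the source lines whose allocations grew the most. This is only a sketch of that approach, not what was used in this investigation: `allocate` is a hypothetical workload standing in for the request loop, and `top_growth` is a hypothetical helper name.

```python
import tracemalloc


def allocate(n):
    """Hypothetical workload standing in for the request loop."""
    return [bytearray(10_000) for _ in range(n)]


def top_growth(workload, limit=3):
    """Run workload and return its result plus the largest allocation diffs."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    kept = workload()  # keep the result alive so it shows up as growth
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Differences are grouped by source line, largest change first
    stats = after.compare_to(before, "lineno")
    return kept, stats[:limit]


kept, stats = top_growth(lambda: allocate(100))
for stat in stats:
    print(stat)
```

Each printed entry names a file and line number together with the size difference, which can point at the code path that allocated the memory still being held.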

spyoungtech commented, Apr 25, 2020

That does sound strange. Unfortunately, I have no idea why it would stop suddenly. I’ve tested locally with as many or more requests, and it never flat out stops.

I have run into similarly strange issues in the past, though. Perhaps consider updating or changing the version of gevent and/or the version of Python you’re using to see if that changes anything. That’s really just a guess, though.
