evaluation on multiple solutions at once causes memory leak

Hi @xksteven, I have a question: why do you advise running the evaluation code on one solution at a time instead of on all generations at once? I have added the metric to the HuggingFace hub at https://huggingface.co/spaces/codeparrot/apps_metric (I didn’t change the core script testing_util.py), with evaluation done for all solutions at once. I sometimes get a memory leak whose source I can’t identify, because it doesn’t happen when I evaluate the same solutions separately.

Below is the code that causes memory saturation:

from evaluate import load

generations = [["s = input()\nn = len(s)\nm = 0\n\nfor i in range(n):\n\tc = s[i]\n\tif c == '|':\n\t\tif m < 2:\n\t\t\tm = 2\n\t\telse:\n\t\t\tm += 1\n\telif c == '\\n':\n\t\tif m < 2:\n\t\t\tm = 2\n\t\telse:\n\t\t\tm += 1\n\nif m < 2:\n\tprint(-1)\nelse:\n\tprint(m * 2 - 1)\n"], ["\nx = int(input())\n\nl = list(range(x+1))\n\nm = next(l)\n\ns = sum(list([int(i) for i in str(m)]))\n\nif s > sum(list([int(i) for i in str(m)])) :\n\tm = next(l)\n\t\nprint(m)\n"]]

metric = load("codeparrot/apps_metric")

results = metric.compute(predictions=generations, level="all", debug=False)

While this works fine:

generation_1 = generations[:1]
generation_2 = generations[1:2]
results_1 = metric.compute(predictions=generation_1, level="all", debug=False)
results_2 = metric.compute(predictions=generation_2, level="all", debug=False)
print(results_1)
print(results_2)
{'avg_accuracy': 0.23185840707964603, 'strict_accuracy': 0.0, 'pass_at_k': None}
{'avg_accuracy': 0.0, 'strict_accuracy': 0.0, 'pass_at_k': None}

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 14 (6 by maintainers)

Top GitHub Comments

1 reaction
xksteven commented, Aug 5, 2022

If you could open a PR once you have finished testing, we’d be happy to accept it. 😃

1 reaction
loubnabnl commented, Aug 3, 2022

Hi, sorry for not updating you earlier. The memory leak happens at this line (for both indexes 20 and 21): https://github.com/hendrycks/apps/blob/1b052764e10804ae79cf12c24801aaa818ea36ab/eval/testing_util.py#L303 The timeout there doesn’t work; it seems to be overwritten by call_method, and I didn’t manage to fix it. As a workaround, I added a global timeout for all tests: https://huggingface.co/spaces/codeparrot/apps_metric/blob/main/utils.py#L12

import json
import multiprocessing
from datasets import load_dataset
from testing_util import run_test

DATASET = "codeparrot/apps"

apps_eval = load_dataset(DATASET, split="test", difficulties=["all"])

def check_correctness(sample, generation, timeout, debug=True):
    def _temp_run(sample, generation, debug, result):
        result.append(run_test(sample, test=generation, debug=debug))

    # Run the test in a separate process so that a hung solution can be killed.
    manager = multiprocessing.Manager()
    result = manager.list()
    p = multiprocessing.Process(target=_temp_run, args=(sample, generation, debug, result))
    p.start()
    # Global timeout for all tests, with one extra second of slack.
    p.join(timeout=timeout + 1)
    if p.is_alive():
        p.kill()
    if not result:
        in_outs = json.loads(sample["input_output"])
        # Consider that all tests failed.
        result = [[-1 for i in range(len(in_outs["inputs"]))]]
        if debug:
            print("global timeout")
    return result[0]

generation = "\nx = int(input())\n\nl = list(range(x+1))\n\nm = next(l)\n\ns = sum(list([int(i) for i in str(m)]))\n\nif s > sum(list([int(i) for i in str(m)])) :\n\tm = next(l)\n\t\nprint(m)\n"
sample = apps_eval[1]
print(check_correctness(sample, generation, timeout=10, debug=False))

I still need to run some tests to make sure this doesn’t heavily affect the scores, but I don’t think it should, as 10 seconds seems like a generous threshold to me. Happy to open a PR if you want to add this to your repo.
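
For readers who want to try the global-timeout idea without downloading the dataset, here is a minimal, self-contained sketch of the same pattern. It is not the code from apps_metric: the names run_with_timeout and _worker are hypothetical, it passes the result back through a Pipe instead of a Manager list, and it assumes the Unix-only "fork" start method so that the callable does not need to be picklable.

```python
import multiprocessing


def _worker(conn, func, args):
    """Run func(*args) in the child process and send the result back."""
    try:
        conn.send(func(*args))
    finally:
        conn.close()


def run_with_timeout(func, args, timeout, fallback):
    """Return func(*args), or fallback if no result arrives within timeout seconds."""
    # Assumption: the "fork" start method (Unix only), so func need not be picklable.
    ctx = multiprocessing.get_context("fork")
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_worker, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # keep only the child's copy of the write end open
    result = fallback
    try:
        # poll() waits up to `timeout` seconds for data or for the pipe to close.
        if parent_conn.poll(timeout):
            result = parent_conn.recv()
    except EOFError:
        pass  # the child exited without sending anything
    finally:
        # Kill the child if it is still running, then reap it.
        if proc.is_alive():
            proc.kill()
        proc.join()
        parent_conn.close()
    return result
```

In this sketch, a call such as run_with_timeout(time.sleep, (60,), 1, [-1]) should give up after roughly one second and return the fallback [-1], which mirrors how the workaround above marks every test as failed when the global timeout is hit.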
